excluding utterances with code-switching
jbang at stanford.edu
Mon Nov 19 18:52:41 EST 2018
We are working on bilingual transcriptions and had a question about code-switched utterances. Apologies if I've missed this in the manual.
One of our goals is to obtain an MLU for Spanish-only utterances, excluding mixed utterances. For example:
*MOT: ahorita tienes que comer.
*MOT: no es time@s to@s sleep@s.
We would like to obtain an MLU (on the %mor line) that excludes the utterance with code-switching. We've tried the following command, but it counts both utterances and only drops the English words, whereas we'd like the output to count only the Spanish-only line.
mlu -s"[- eng]" -s"L2|*"
It seems like our options are:
1) go back to our transcripts and add a postcode to every code-switched utterance, so that we can use the +s switch with postcodes
2) use kwal to exclude utterances with the @s symbol, similar to what is seen here <https://groups.google.com/forum/#!msg/chibolts/kdeVQEw7OZI/Siad8ni4SrEJ;context-place=searchin/chibolts/adding$20postcode%7Csort:date>
Is there a way to use the switches to exclude utterances containing the @s symbol, or to automatically add a postcode to every utterance in our transcripts that contains the @s symbol?
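To make the second idea concrete, here is a minimal sketch in Python of what automating the postcode might look like. It is not a CLAN feature: the postcode label [+ cs] and the function name tag_code_switched are hypothetical, chosen only for illustration, and the sketch assumes each utterance sits on a single main-tier line beginning with "*" (continuation lines are not handled).

```python
import re

# Hypothetical postcode label, chosen for illustration only.
POSTCODE = "[+ cs]"

# A word ending in the @s marker (e.g. time@s, or time@s:eng).
# The lookahead avoids matching other markers that merely start with "@s".
CODE_SWITCH = re.compile(r"\S@s(?=[\s:.,!?]|$)")


def tag_code_switched(lines, postcode=POSTCODE):
    """Append `postcode` to each main-tier line that contains an @s word.

    Assumes one utterance per line, with main tiers starting with '*'.
    Dependent tiers (starting with '%') and headers are left untouched.
    """
    tagged = []
    for line in lines:
        text = line.rstrip("\n")
        if (text.startswith("*")
                and CODE_SWITCH.search(text)
                and postcode not in text):
            text = f"{text} {postcode}"
        tagged.append(text)
    return tagged


if __name__ == "__main__":
    sample = [
        "*MOT:\tahorita tienes que comer.",
        "*MOT:\tno es time@s to@s sleep@s.",
    ]
    for line in tag_code_switched(sample):
        print(line)
```

Only the second sample utterance receives the postcode; the Spanish-only line is left as it is. Whether the tagged files still pass CHECK, and how the postcode interacts with the %mor tier, would need to be verified against the manual.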
Thank you in advance,