bilingual

Tue Dec 31 16:16:01 UTC 2013

Dear Leonid,

Thank you so much for your prompt and very helpful reply! Wish you a happy
new year!

Best wishes!
Ying

On Tue, Dec 31, 2013 at 7:17 AM, Leonid Spektor <spektor at andrew.cmu.edu> wrote:

> Dear Ying,
>
> First, I would suggest that you use the +/-s"[- zho]" option with both the
> MOR and POST commands:
>
> mor +s"[- zho]" sample_English.cha +1
> post +s"[- zho]" sample_English.cha +1
> mor -s"[- zho]" sample_English.cha +1
> post -s"[- zho]" sample_English.cha +1
>
>
> Here are answers to your questions:
>
> 1. You can fix errors in any order you like, as long as in the end CHECK
> reports no errors found. If you are using ESC-L CHECK, then you will not
> have a choice of ignoring the first error found, because ESC-L CHECK always
> starts from the top of the file. It has no option to continue from the
> current location.
>
> 2. If you use "park@s" instead of "park@s$n", then MLU will give the correct
> result.
> The TTR results could be wrong depending on your FREQ command. If you only
> run FREQ with the +s"[- zho]" or -s"[- zho]" options, then the result will
> be correct with either the "park@s" or the "park@s$n" choice. If you run
> FREQ without the +/-s"[- zho]" options, then you will force FREQ to compare
> words such as "park@s" and "park", which are not the same, and that will
> inflate the TTR result. If you use the "park@s$n" choice and run FREQ on the
> %mor tier, i.e., with the "+t%mor -t*" options, then the result will be more
> accurate.
>
> 3. Your sample file had "[*tense]" instead of the "[* tense]" code; note
> the missing space character. That may be why the search failed to find the
> "[* tense]" code. I added the space character and got the correct result
> when searching for the "[* tense]" code with this command:
>
> freq +s"[* tense]" sample_English.cha
>
> If you want to count the actual words associated with code "[* tense]",
> then use this command:
>
> freq +s"<* tense>" sample_English.cha
>
>
> I hope this helps and Happy New Year!
>
>
> Leonid.
>
>
>
> On Dec 30, 2013, at 18:46, Ying <yl5834 at gmail.com> wrote:
>
> Dear Leonid,
>
> I want to get MLU, the number of different words, and also TTR from some
> Mandarin(Putonghua)-English bilingual narrative data. Also I added some
> word and utterance level codes and want to summarize the codes. For
> example, for the following sample (I am attaching the transcript after
> running the commands),
>
> *CHI: [- zho] 我 去 了 一 个 <一个> [/] park@s yesterday@s. [+ CS]
> *CHI: It is a very big one.
> *EXA: Nice.
> *CHI: My mom say [* tense] “We will come from time to time”. [+ GE]
> Note:
> The precode [- zho] is for Mandarin/Putonghua, as [- yue] is for Cantonese
> [+ CS] is an utterance level code, indicating code-switched sentences
> [+ GE] is an utterance level code, indicating sentences with grammatical
> errors
> [* tense] is a word level code, indicating a tense error
>
> Here are the commands I used:
> mor +s"[- zho]" sample_English.cha +1
> post sample_English.cha +1
> mor -s"[- zho]" sample_English.cha +1
> post sample_English.cha +1
> Esc_L
> freq +s"[% *]" *.cha
>
> Questions I have:
> (1) May I ignore an error and move to the next one when I run CHECK?
> (2) For code-switched words within an utterance, I don't care for mor info
> such as noun or verb. But I do want to calculate MLU and TTR. If I go with
> park@s but don't bother to make park@s$n, will CLAN give me the correct
> results?
> (3) I can get codes [zho], [CS], and [GE] calculated using FREQ, but not
> [* tense]. How may I count the occurrences of [* tense]? Moreover, can I
> know whether it is the same verb (e.g., say) coded [* tense]?
>
> Thank you very much!
> Happy New Year!
>
> Sincerely,
> Ying
>
>
> --
> You received this message because you are subscribed to the Google Groups
> "chibolts" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to chibolts+unsubscribe at googlegroups.com.
> To post to this group, send email to chibolts at googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/chibolts/6de36ec6-5c23-41d7-a0b5-06f5b35a5ce2%40googlegroups.com
> .
> For more options, visit https://groups.google.com/groups/opt_out.
> <sample_English.cha>
>
>
>
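
As a rough illustration of the TTR point in answer 2, the sketch below shows how counting "park@s" and "park" as separate word types raises the type-token ratio. It is not how FREQ computes TTR. The token list and the helper names (type_token_ratio, strip_marker) are hypothetical, and whether the two forms should be merged at all is an analysis decision rather than something the thread settles.

```python
def type_token_ratio(tokens):
    """Return the number of distinct word types divided by the number of tokens."""
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


def strip_marker(token):
    """Drop a trailing special-form marker such as @s or @s$n (illustrative only)."""
    return token.split("@", 1)[0]


# Hypothetical token list, not taken from the sample transcript.
tokens = ["park@s", "park", "is", "big", "park"]

# "park@s" and "park" compared as different strings: 4 types over 5 tokens.
raw_ttr = type_token_ratio(tokens)

# Markers stripped before comparison: 3 types over 5 tokens.
merged_ttr = type_token_ratio([strip_marker(t) for t in tokens])

print(raw_ttr, merged_ttr)
```

With this made-up list the first ratio is higher than the second, which is the kind of inflation the reply describes when FREQ compares "park@s" with "park".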
