different MLUs for different CLAN versions
Brian MacWhinney
macw at cmu.edu
Tue Sep 28 18:32:39 UTC 2010
Dear Bobbi,
Thanks for the clear report on this. The July version of CLAN was using the %mor line, so that cannot be the difference. I think this change is due to the treatment of fillers in MOR. Before July, forms like "um" and "uh" were recognized as lexical items on the %mor line, but not on the main line. This was essentially a "bug" in MLU that arose from the transition from computing on the main line to computing on the %mor line. To fix this, I changed all of the "um" forms to "&um". As a result, they no longer end up on the %mor line, which is the correct treatment.
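To make the effect of that change concrete, here is a minimal sketch in Python. It is not CLAN's code and does not model the %mor line: it assumes a toy transcript in which each utterance is a list of tokens, and a hypothetical counting rule that skips any token beginning with "&". The function name and the sample utterances are invented for illustration.

```python
def mlu(utterances, skip_prefix="&"):
    """Mean number of counted tokens per utterance (toy stand-in for MLU)."""
    lengths = []
    for tokens in utterances:
        # Hypothetical rule: tokens marked with the prefix are not counted.
        counted = [t for t in tokens if not t.startswith(skip_prefix)]
        if counted:  # assumption: utterances with nothing countable are dropped
            lengths.append(len(counted))
    return sum(lengths) / len(lengths) if lengths else 0.0


# Invented sample data: the same two utterances, with and without "&" marking.
unmarked = [["um", "I", "want", "cookie"], ["uh", "more", "juice"]]
marked = [["&um", "I", "want", "cookie"], ["&uh", "more", "juice"]]

mlu_unmarked = mlu(unmarked)  # fillers counted: (4 + 3) / 2
mlu_marked = mlu(marked)      # fillers skipped: (3 + 2) / 2
print(mlu_unmarked, mlu_marked)
```

In this toy case the only thing that changes between the two runs is whether the fillers are marked, which is enough to move the ratio.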
-- Brian MacWhinney
On Sep 28, 2010, at 12:45 PM, RCorrigan wrote:
> For homework, I had my Language Acquisition class compute a simple MLU
> on Adam's speech in adam01. I noticed that they were getting a
> different answer than I did. It turns out that I was using a July
> version of CLAN and they were using a September version.
> My output looked like this:
>
> From file <adam01.cha>
> mlu +t*CHI adam01.cha
> Mon Sep 27 07:14:31 2010
> mlu (02-Jul-2010) is conducting analyses on:
> ONLY speaker main tiers matching: *CHI;
> ****************************************
> From file <adam01.cha>
> MLU for Speaker: *CHI
> MLU (xxx and yyy are EXCLUDED from the utterance and morpheme
> counts):
> Number of: utterances = 1232, morphemes = 2582
> Ratio of morphemes over utterances = 2.096
> Standard deviation = 1.024
>
> Their output looked like this:
> mlu +t*chi adam01.cha
> Tue Sep 28 11:33:56 2010
> mlu (08-Sep-2010) is conducting analyses on:
> ONLY dependent tiers matching: %MOR;
> ****************************************
> From file <adam01.cha>
> MLU for Speaker: *CHI
> MLU (xxx and yyy are EXCLUDED from the utterance and morpheme
> counts):
> Number of: utterances = 1232, morphemes = 2644
> Ratio of morphemes over utterances = 2.146
> Standard deviation = 1.067
>
> I know it's not a big difference, but why should MLUs change from one
> slight revision of CLAN to the next? The best I can figure, the July
> version must not have been calculating MLU on the %MOR line, but all
> the documentation for years has claimed it was.
>
> In addition, the browsable database is using yet another July
> version and is giving slightly different answers than the other two
> (both the utterance and morpheme counts are different on this one).
>
> mlu +t*chi adam01.cha
>
> Tue Sep 28 12:36:51 2010
> mlu (10-Jun-2009) is conducting analyses on:
> ONLY speaker main tiers matching: *CHI;
> ****************************************
> From file "adam01.cha"
> MLU for Speaker: *CHI
> MLU (xxx and yyy are EXCLUDED from the utterance and morpheme counts):
> Number of: utterances = 1236, morphemes = 2582
> Ratio of morphemes over utterances = 2.089
> Standard deviation = 1.030
>
> Thanks for letting me know what is going on.
>
> Bobbi Corrigan
>
> --
> You received this message because you are subscribed to the Google Groups "chibolts" group.
> To post to this group, send email to chibolts at googlegroups.com.
> To unsubscribe from this group, send email to chibolts+unsubscribe at googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/chibolts?hl=en.
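The three ratios quoted in the message can be checked directly from the reported counts, since each output line gives the ratio of morphemes over utterances. The short Python sketch below does only that arithmetic; the dictionary layout and names are illustrative, and the numbers are copied from the outputs quoted above.

```python
# (utterances, morphemes, reported ratio) keyed by the mlu version string
# shown in each quoted output.
reports = {
    "02-Jul-2010": (1232, 2582, 2.096),
    "08-Sep-2010": (1232, 2644, 2.146),
    "10-Jun-2009": (1236, 2582, 2.089),
}


def ratio(utterances, morphemes):
    """Morphemes divided by utterances, as in the quoted output."""
    return morphemes / utterances


computed = {v: ratio(u, m) for v, (u, m, _) in reports.items()}
for version, (utts, morphs, reported) in reports.items():
    print(f"{version}: {computed[version]:.3f} (reported {reported})")

# Extra morphemes in the September run over the same 1232 utterances.
extra_morphemes = reports["08-Sep-2010"][1] - reports["02-Jul-2010"][1]
print(extra_morphemes)
```

All three reported ratios agree with the counts to three decimal places. The July and September runs differ by 62 morphemes over the same 1232 utterances, while the 2009 run has the same morpheme count as July but four more utterances.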