differences in MLU between browser and CLAN
Jennifer Ganger
jennifer.ganger at gmail.com
Tue Nov 18 21:50:00 UTC 2025
Hello again,
Some of my students pointed out today that they are getting different MLU
results when they run the mlu command in the browser versus in CLAN. The
effect seems to be widespread, not limited to one corpus. They first noticed
discrepancies with the Tardif corpus but then found more.
Taking Eve (Brown corpus) file 020000a in the browser as an example, the
command
mlu +t*CHI 020000a.cha yields:
From file <childes/Eng-NA/Brown/Eve/020000a.cha>
MLU for Speaker: *CHI:
MLU (xxx, yyy and www are EXCLUDED from the utterance and morpheme counts):
Number of: utterances = 424, morphemes = 3687
Ratio of morphemes over utterances = 8.696
Standard deviation = 5.953
That can't be correct.
Running the same command in CLAN on the downloaded transcript yields:
From file <C:\talkbank\clan\Brown\Eve\020000a.cha>
MLU for Speaker: *CHI:
MLU (xxx, yyy and www are EXCLUDED from the utterance and morpheme counts):
Number of: utterances = 424, morphemes = 1468
Ratio of morphemes over utterances = 3.462
Standard deviation = 1.975
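As a quick sanity check, here is a minimal Python sketch that recomputes both ratios from the counts printed above. It uses only the figures from the two outputs. The labels "browser" and "clan" and the function name mlu are just illustrative, and this is not how CLAN itself computes anything.

```python
# Figures copied from the two MLU outputs above.
# The dictionary keys and the function name are illustrative assumptions.
runs = {
    "browser": {"utterances": 424, "morphemes": 3687, "reported_mlu": 8.696},
    "clan": {"utterances": 424, "morphemes": 1468, "reported_mlu": 3.462},
}


def mlu(morphemes, utterances):
    """Mean length of utterance: morphemes divided by utterances."""
    return morphemes / utterances


for name, r in runs.items():
    computed = mlu(r["morphemes"], r["utterances"])
    print(f"{name}: computed {computed:.3f}, reported {r['reported_mlu']}")

# How much larger is the browser's morpheme count for the same utterances?
morpheme_ratio = runs["browser"]["morphemes"] / runs["clan"]["morphemes"]
print(f"browser/clan morpheme ratio: {morpheme_ratio:.2f}")
```

Each reported ratio agrees with its own counts, and the utterance count is 424 in both runs, so the discrepancy seems to lie entirely in the morpheme counts: the browser reports roughly 2.5 times as many morphemes for the same utterances.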
Any advice would be appreciated.
Thanks,
Jenny