differences in MLU between browser and CLAN

Jennifer Ganger jennifer.ganger at gmail.com
Tue Nov 18 21:50:00 UTC 2025


Hello again,
Some of my students pointed out today that the MLU command gives them 
different results in the browser than in CLAN. The discrepancy seems to 
be widespread, not limited to one corpus. They first noticed it in the 
Tardif corpus but then found more.

Taking Eve (Brown corpus) file 020000a as an example, in the browser the 
command
mlu +t*CHI 020000a.cha
yields:
From file <childes/Eng-NA/Brown/Eve/020000a.cha>
MLU for Speaker: *CHI:
  MLU (xxx, yyy and www are EXCLUDED from the utterance and morpheme counts):
Number of: utterances = 424, morphemes = 3687
Ratio of morphemes over utterances = 8.696
Standard deviation = 5.953

An MLU of 8.696 can't be correct for this file.

Running the same command in CLAN on the downloaded transcript yields:
From file <C:\talkbank\clan\Brown\Eve\020000a.cha>
MLU for Speaker: *CHI:
  MLU (xxx, yyy and www are EXCLUDED from the utterance and morpheme counts):
Number of: utterances = 424, morphemes = 1468
Ratio of morphemes over utterances = 3.462
Standard deviation = 1.975
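
As a quick sanity check, both reported ratios can be reproduced from the 
counts above. This is only a minimal sketch in Python; the function name 
mlu is illustrative and is not how CLAN computes the measure internally:

```python
def mlu(morphemes: int, utterances: int) -> float:
    """Mean length of utterance: total morphemes divided by total utterances."""
    return morphemes / utterances

# Counts copied from the two outputs for 020000a.cha, speaker *CHI.
browser = {"utterances": 424, "morphemes": 3687}
clan = {"utterances": 424, "morphemes": 1468}

browser_mlu = mlu(browser["morphemes"], browser["utterances"])
clan_mlu = mlu(clan["morphemes"], clan["utterances"])

print(f"browser MLU = {browser_mlu:.3f}")
print(f"CLAN MLU    = {clan_mlu:.3f}")

# The utterance counts agree, so any difference comes from the morpheme counts.
same_utterances = browser["utterances"] == clan["utterances"]
morpheme_ratio = browser["morphemes"] / clan["morphemes"]
print(f"same utterance count: {same_utterances}")
print(f"browser/CLAN morpheme count ratio = {morpheme_ratio:.2f}")
```

Both ratios are internally consistent with their own counts, and the 
utterance counts match (424 in each case). The only difference is the 
morpheme count, so the two runs seem to be counting morphemes differently 
rather than selecting different utterances.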

Any advice would be appreciated.

Thanks,
Jenny

