MLU in characters?
'Janet Bang' via chibolts
chibolts at googlegroups.com
Tue Jul 29 23:10:10 UTC 2025
Much appreciated!
On Tue, Jul 29, 2025 at 3:33 PM Leonid Spektor <spektor at andrew.cmu.edu>
wrote:
> I have changed MLU to count characters. The new options are -bw for
> counting words and -bc for counting characters. Without the -b option, MLU
> will count morphemes.
>
> New CLAN is on the web.
>
>
> Leonid.
>
> On Jul 29, 2025, at 14:28, Janet Bang <janet.bang at sjsu.edu> wrote:
>
> Hi everyone,
>
> Thank you for your ideas. It had also crossed our minds to insert the
> spaces manually! We'll also look into WDLEN.
>
> @Leonid Spektor <spektor at andrew.cmu.edu>, yes I think that would work for
> our exploratory use case (comparing types, tokens, and MLU for English and
> Spanish using morphemes, words, and characters). We are still in the
> early stages.
>
> Would the +b option consider the same words and utterances that would be
> counted with MLUw? Or would this disregard the MLU rules that are built in?
>
> Janet
>
> On Tue, Jul 29, 2025 at 11:01 AM Leonid Spektor <spektor at andrew.cmu.edu>
> wrote:
>
>> Hi,
>>
>> It is easy to add an option to MLU to count characters over utterances.
>> Currently, MLU counts words or morphemes over utterances.
>>
>> Just to confirm that I understand what you want: I will change the +b
>> option to count characters or words. When counting characters, the number
>> of characters in each word will be counted, and the sum over all words
>> will be used to compute MLU over utterances. Is this what you want?
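
The calculation described above can be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not how CLAN implements it: the function name mlu_in_characters is hypothetical, utterances are assumed to be plain strings that have already been filtered and stripped of CHAT codes, words are split on whitespace, and none of MLU's built-in exclusion rules are applied.

```python
def mlu_in_characters(utterances):
    """Mean number of characters per utterance.

    Each utterance is split on whitespace, the characters of every word
    are counted, and the total is divided by the number of utterances.
    Assumes the input is already cleaned; no MLU exclusion rules are applied.
    """
    if not utterances:
        return 0.0
    total_chars = sum(len(word) for utt in utterances for word in utt.split())
    return total_chars / len(utterances)


# Hypothetical sample: one English and one Spanish utterance.
sample = ["the dog runs", "el perro corre"]
print(mlu_in_characters(sample))  # (10 + 12) / 2 = 11.0
```

Note that len() counts Unicode code points, so an accented letter such as "é" counts as one character only when it is stored in precomposed form.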
>>
>> If it is, then I will put a new version of CLAN on the web by the end of
>> today.
>>
>>
>> Leonid.
>>
>> On Jul 29, 2025, at 13:24, Nan Bernstein Ratner <nratner at umd.edu> wrote:
>>
>> Couldn't WDLEN do something in this regard? It counts characters...
>>
>> Nan Bernstein Ratner, F-, H-ASHA, F-AAAS, Board Certified Specialist in
>> Stuttering, Cluttering, and Fluency Disorders
>> she/her/hers
>> Distinguished University Professor
>> Hearing and Speech Sciences
>> University of Maryland
>> 0100 Lefrak Hall, 7251 Preinkert Drive
>> College Park, MD 20742
>> nratner at umd.edu, 301-405-4217 My Zoom
>> <https://umd.zoom.us/j/7924324343>
>> Co-director: FluencyBank (www.fluency.talkbank.org);
>> http://languagefluency.umd.edu/
>>
>> Faculty, Language Science (languagescience.umd.edu); Neuroscience &
>> Cognitive Neuroscience (NACS, nacs.umd.edu), Developmental Science Field
>> Committee
>>
>> https://hesp.umd.edu/facultyprofile/bernstein-ratner/nan
>>
>>
>> My PubMed bibliography:
>> https://www.ncbi.nlm.nih.gov/myncbi/1RORcBHUvuRQ82/bibliography/public/
>>
>>
>> On Tue, Jul 29, 2025 at 11:27 AM Shanley <allen at rhrk.uni-kl.de> wrote:
>>
>>> The poor person’s workaround: you could trick the system by making
>>> each character into a word, i.e., by putting a space between every
>>> character on whichever tier you’re using to count MLU. Surely a Python
>>> script could easily do this for you.
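
A minimal sketch of such a script is below. It assumes a tier line has the form label, tab, content (for example "*CHI:" followed by a tab), and the function name space_out_characters is hypothetical. It treats every whitespace-separated token as a word, so CHAT codes and multi-character markers would be split as well and would need to be handled separately.

```python
def space_out_characters(tier_line):
    """Put a space between every character of each word on a tier line.

    The tier label before the first tab (e.g. "*CHI:") is left untouched.
    Every whitespace-separated token is split, including any CHAT codes.
    """
    label, sep, content = tier_line.partition("\t")
    if not sep:  # no tab found: treat the whole line as content
        label, content = "", tier_line
    spaced = " ".join(" ".join(word) for word in content.split())
    return f"{label}{sep}{spaced}"


# Hypothetical example line; the single-character terminator is unchanged.
print(space_out_characters("*CHI:\tel perro ."))
```

Running a word-based count on the spaced-out tier would then count each character as one word.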
>>>
>>> A more complicated variant would be to write a Python script that
>>> calculates what you want from the existing file.
>>>
>>> In both cases, you should of course take Leonid’s observation below into
>>> account: you would first need to decide which words and utterances
>>> should be included.
>>>
>>> Best,
>>> Shanley Allen.
>>>
>>>
>>>
>>> On 25. Jul 2025, at 15:15, 'Janet Bang' via chibolts <
>>> chibolts at googlegroups.com> wrote:
>>>
>>> Got it, thank you!
>>>
>>> On Fri, Jul 25, 2025 at 12:12 PM Leonid Spektor <spektor at andrew.cmu.edu>
>>> wrote:
>>>
>>>> Janet,
>>>>
>>>> I am sorry to say it, but MLU can only count words or morphemes.
>>>>
>>>> If you plan to use another program, then please keep in mind that MLU
>>>> uses many rules to decide whether an utterance or word should be counted.
>>>> You can read those rules in the CLAN manual at
>>>> https://talkbank.org/0info/manuals/CLAN.pdf. Please look for chapter
>>>> 7.19, "MLU", in the manual.
>>>>
>>>>
>>>> Leonid.
>>>>
>>>> On Jul 25, 2025, at 14:53, 'Janet Bang' via chibolts <
>>>> chibolts at googlegroups.com> wrote:
>>>>
>>>> Hello!
>>>>
>>>> Is there a way to use the MLU program to extract MLU in characters? We
>>>> are exploring measures to facilitate cross-linguistic comparisons between
>>>> English and Spanish, and someone recommended MLU in characters (rather
>>>> than MLU in words) given the orthographic transparency of Spanish.
>>>>
>>>> We saw some other programs on github, but I was hoping there was
>>>> something within CLAN because we had already used MOR within CLAN.
>>>>
>>>> Thanks,
>>>> Janet
>>>>
>>>> --
>>>> You received this message because you are subscribed to the Google
>>>> Groups "chibolts" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an email to chibolts+unsubscribe at googlegroups.com.
>>>> To view this discussion visit
>>>> https://groups.google.com/d/msgid/chibolts/9b2ae135-2fdb-4b55-b9f2-06886ace8217n%40googlegroups.com
>>>> <https://groups.google.com/d/msgid/chibolts/9b2ae135-2fdb-4b55-b9f2-06886ace8217n%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>> .
>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> Janet Y. Bang, Ph.D (she/her/hers)
>>> Assistant Professor
>>> Child and Adolescent Development
>>> Lurie College of Education, San José State University
>>> janet.bang at sjsu.edu | 408-924-3714
>>> https://www.sjsu.edu/education/faculty/janet-bang.php
>>>
>>>
>>> ********************************************************************************
>>> Prof. Dr. Shanley E. M. Allen
>>> Director, Psycholinguistics and Language Development Group
>>> Center for Cognitive Science
>>> University of Kaiserslautern-Landau
>>> Erwin-Schrödinger-Straße 57/409
>>> 67663 Kaiserslautern
>>> Germany
>>>
>>> e-mail: allen at rptu.de
>>> phone: +49-631-205-4136
>>> fax: +49-631-205-5182
>>> office: Building 57, Office 409
>>> web: http://www.sowi.uni-kl.de/psycholinguistics/home/
>>>
>>> ********************************************************************************
>>>
>>>
>>
>
>
>
>
>
--
Janet Y. Bang, Ph.D (she/her/hers)
Assistant Professor
Child and Adolescent Development
Lurie College of Education, San José State University
janet.bang at sjsu.edu | 408-924-3714
https://www.sjsu.edu/education/faculty/janet-bang.php