KWAL, MLU

Mon Feb 16 10:11:26 UTC 2004

>===== Original Message From "Brian MacWhinney" <macw at cmu.edu> =====
>Dear Emily,
>  Sorry about the delay in adding further comment on the issue of
>analyzing files with a certain number of turns.  The short answer is
>that KWAL pulls out turns without exclusions, whereas MLU applies
>additional exclusionary criteria.  So the results are usually different.
> But, then, KWAL and MLU have different purposes.  It might be possible
>to modify the programs to achieve some of your purposes.  However, right
>now, I am not clear enough about what you are trying to do.  Unless
>there is some particular reason to worry about MLU, why can't you just
>use KWAL's method for pulling out a specified range of utterances?
>
>--Brian MacWhinney

Dear Brian,

I have counted the instances of code-mixing by the subject in each file because
I think code-mixing can give me an idea of what the bilingual subject's
languages are like. However, since file length varies across transcripts,
the code-mixing counts are not comparable between files, so they cannot be
used in statistical calculations. Therefore, I would like to standardize the
length of all the files in terms of utterances: the shortest file, the one
containing the smallest total number of utterances by the subject, would serve
as the baseline, and the longer files would be shortened to that length. For
this, I need MLU to give me the total number of utterances and KWAL to pull
out a specified range of utterances. However, as you told me, the two programs
count different numbers of utterances.
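
As an illustration of the standardization described above, here is a minimal
sketch in Python. It runs outside CLAN and does not reproduce how KWAL or MLU
count utterances; in particular, it applies none of MLU's exclusion criteria.
It assumes CHAT-style transcripts in which each utterance of the subject
starts a line with a main-tier marker such as "*CHI:". The speaker code "CHI",
the file names, and the helper functions are all hypothetical.

```python
def speaker_utterances(lines, speaker="CHI"):
    """Return the main-tier lines for one speaker (assumed '*CODE:' prefix)."""
    prefix = f"*{speaker}:"
    return [line for line in lines if line.startswith(prefix)]


def truncate_to_baseline(transcripts, speaker="CHI"):
    """Cut every transcript down to the utterance count of the shortest one.

    transcripts maps a file name to its list of lines.
    Returns (baseline, {file name: first `baseline` utterances of the speaker}).
    """
    per_file = {
        name: speaker_utterances(lines, speaker)
        for name, lines in transcripts.items()
    }
    # The shortest file, counted in utterances of the subject, sets the baseline.
    baseline = min(len(utts) for utts in per_file.values())
    return baseline, {name: utts[:baseline] for name, utts in per_file.items()}


if __name__ == "__main__":
    # Toy data standing in for real transcript files.
    example = {
        "short.cha": ["@Begin", "*CHI:\thello .", "*MOT:\thi .", "*CHI:\tbye ."],
        "long.cha": ["*CHI:\tone .", "*CHI:\ttwo .", "*CHI:\tthree ."],
    }
    baseline, cut = truncate_to_baseline(example)
    print(baseline, {name: len(utts) for name, utts in cut.items()})
```

Because a single counting rule is used both for the totals and for the
selected range, the two numbers cannot disagree, which is the difficulty with
combining MLU and KWAL.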

In fact, I would like to know how to generate an 'upper bound' in the CLAN program.
Thank you for your attention.

Best regards,
Emily