Excluding non-words from TTR/Vocab Size counts?

Brian MacWhinney macw at cmu.edu
Tue Jun 25 17:51:01 UTC 2013


Dear Megan,

     A calculation run on the main line cannot automatically know what is and is not a non-word.  However, if you are calculating from the %mor line, then FREQ should be making the right choices.  This is because a lot of these decisions are encoded inside the analysis that MOR does to create a %mor line.  Those decisions can be modified in many ways.  For example, you might wish to exclude all words with the "co" part of speech.  But other researchers may have other criteria, so you need to think through exactly why you are excluding particular things.  A lot will also depend on how transcription was done.
    If you don't have a %mor line in your transcript, you can simply list all the words you think should be excluded in a file to be used with the -s switch.
    By the way, have you considered using VOCD instead of TTR?  It is better justified statistically.
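
    To make the effect of an exclusion list concrete, here is a minimal Python sketch of a type/token count with and without excluded forms.  This is not how FREQ is implemented; the function name type_token_ratio, the sample tokens, and the exclusion list are hypothetical and only illustrate the arithmetic.

```python
def type_token_ratio(tokens, excluded=()):
    """Return (types, tokens, TTR) after dropping excluded word forms.

    Illustrative only: the lowercasing and exact-match exclusion used here
    are assumptions, not a description of CLAN's behavior.
    """
    excluded_forms = {w.lower() for w in excluded}
    kept = [t.lower() for t in tokens if t.lower() not in excluded_forms]
    if not kept:
        # Avoid dividing by zero when every token was excluded.
        return 0, 0, 0.0
    types = set(kept)
    return len(types), len(kept), len(types) / len(kept)


# Hypothetical utterance containing some of the non-words from the question.
tokens = "xx the dog ah the ball wha the dog".split()
exclude = ["xx", "ah", "de", "wha", "mah"]

print(type_token_ratio(tokens))           # non-words counted as types
print(type_token_ratio(tokens, exclude))  # non-words removed first
```

    Note that excluding forms changes both the type count and the token count, so the ratio can move in either direction depending on the transcript.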

--Brian MacWhinney

On Jun 25, 2013, at 11:25 AM, M Fields <megan.a.fields at gmail.com> wrote:

> I have been trying to calculate vocabulary size and TTR for my analyses. In the process, I realized that CLAN is counting non-words (e.g. xx, ah, de, wha, mah) as words in the calculation of vocab size and word types. I was under the impression that CLAN would only count English words in these analyses. Is there an additional/different code that needs to be used?
>  
> Thank you
> 
> -- 
> You received this message because you are subscribed to the Google Groups "chibolts" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to chibolts+unsubscribe at googlegroups.com.
> To post to this group, send email to chibolts at googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/chibolts/ffc441b6-27d5-4a1a-b28c-9b8038778d98%40googlegroups.com.
> For more options, visit https://groups.google.com/groups/opt_out.
>  
>  


