Frequency of words

Leonid Spektor spektor at andrew.cmu.edu
Thu Sep 14 13:08:38 UTC 2017


Besides the command lines that I gave you in my previous email, the other solution is to use "furry [:: fairy]" coding. Notice the two ':' characters instead of one. This code tells the MOR command to put the word "furry" on the %mor tier instead of the word "fairy". This solution only works if the word actually spoken by the subject is a real word. Otherwise, you will get "?|..." on the %mor tier, meaning that the word is not recognized. Also, this solution will not ignore the error word, as you seem to want to do.
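To make the difference between the two codings concrete, here is a minimal Python sketch of the rule as described in this thread. It is not CLAN or MOR code; the function name word_sent_to_mor and the regular expression are made up for illustration, and it handles only a single "spoken [: target]" item.

```python
import re

# A spoken word followed by a CHAT replacement code,
# e.g. "furry [: fairy]" or "furry [:: fairy]".
REPLACEMENT = re.compile(r"(\S+)\s+\[(::?)\s+([^\]]+)\]")


def word_sent_to_mor(item):
    """Return the word that ends up on the %mor tier for one item.

    Hypothetical helper that mirrors the behaviour described in this
    thread; it is not part of CLAN.
    """
    item = item.strip()
    match = REPLACEMENT.fullmatch(item)
    if match is None:
        return item  # plain word, no replacement code
    spoken, marker, target = match.groups()
    # [: target]  -> the target word is analyzed
    # [:: target] -> the word actually spoken is analyzed
    return target if marker == ":" else spoken


print(word_sent_to_mor("furry [: fairy]"))   # fairy
print(word_sent_to_mor("furry [:: fairy]"))  # furry
```

With one colon the target "fairy" is what reaches %mor, which is why it showed up in the lemma count; with two colons the spoken word "furry" is used instead.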

The question is whether you want to simply ignore the [: target] word, like the word "fairy", or whether you want to ignore erroneously spoken words altogether, like the whole "furry [:: fairy]" structure.

If you want to ignore the whole erroneously spoken word, then you need to use one of the command lines below. The first counts lemmas from the %mor tier and the second counts words from the speaker tier:

freq -sm** -sm@* +sm;fairy +sm;Cinderella +sm;stair Content_Tester.cha

freq -s"<**>" -s"<: *>" +sfairy +sCinderella +sstairs Content_Tester.cha
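For readers who want to see the intended result outside CLAN, the following Python sketch counts words on the speaker tier while skipping every "spoken [: target]" or "spoken [:: target]" structure as a whole. It is only an illustration of the goal, not a description of how FREQ or its -s options work. The sample utterances, the function name count_speaker_words and the regular expressions are all assumptions made for this example; the real Content_Tester.cha file is not reproduced here.

```python
import re

# A spoken word together with its replacement code,
# e.g. "furry [: fairy]" or "furry [:: fairy]".
ERROR_STRUCTURE = re.compile(r"\S+\s+\[::?\s+[^\]]+\]")
# Any other bracketed code left on the line, e.g. "[* p:w]".
OTHER_CODE = re.compile(r"\[[^\]]*\]")


def count_speaker_words(lines, targets, speaker="*PAR"):
    """Count target words on one speaker tier, ignoring error structures.

    Hypothetical helper for illustration only; it does not use CLAN.
    """
    counts = {word: 0 for word in targets}
    for line in lines:
        if not line.startswith(speaker + ":"):
            continue  # skip headers, dependent tiers, other speakers
        text = line.split(":", 1)[1]
        text = ERROR_STRUCTURE.sub(" ", text)  # drop error word and target
        text = OTHER_CODE.sub(" ", text)       # drop remaining codes
        for word in re.findall(r"[A-Za-z']+", text):
            if word in counts:
                counts[word] += 1
    return counts


# Made-up utterances, not the attached transcript.
sample = [
    "*PAR:\tCinderella went up the stairs .",
    "*PAR:\tthe furry [: fairy] [* p:w] helped her .",
    "*PAR:\tthe fairy waved .",
]

print(count_speaker_words(sample, ["fairy", "Cinderella", "stairs"]))
# {'fairy': 1, 'Cinderella': 1, 'stairs': 1}
```

Only the correctly spoken "fairy" in the third utterance is counted; the "furry [: fairy]" structure contributes nothing. Counting lemmas such as "stair" would still need the %mor tier, which this sketch does not read.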


Leonid.

> On Sep 14, 2017, at 07:28, Brielle Stark <brielle.stark at gmail.com> wrote:
> 
> Yes, so I suppose what I am asking is whether there is something that can tell the mor command to be computed ignoring the [: target] errors? This is the only way I can think of solving my question.
> Thanks,
> Brie
> 
> On Sep 13, 2017 16:51, "Leonid Spektor" <spektor at andrew.cmu.edu> wrote:
> Brie,
> 
> 	The answer depends on whether you are interested in words on the speaker tier or lemmas on the %mor tier. Your command lines ask for both. I have changed your sample file "Content_Tester.cha" by adding a word "fairy" that is not an error or a replacement, so you should get a count of 1 for "fairy" in the output of the following two command lines:
> 
> For "fairy" lemmas, except errors and target replacement, you want:
> 
> freq -sm** -sm@* +sm;fairy Content_Tester.cha
> 
> For "fairy" words on speaker tier, except errors and target replacement, you want:
> 
> freq -s<**> -s<:*> +sfairy Content_Tester.cha
> 
> 
> Leonid.
> 
> 
> -- 
> You received this message because you are subscribed to the Google Groups "chibolts" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to chibolts+unsubscribe at googlegroups.com <mailto:chibolts+unsubscribe at googlegroups.com>.
> To post to this group, send email to chibolts at googlegroups.com <mailto:chibolts at googlegroups.com>.
> To view this discussion on the web visit https://groups.google.com/d/msgid/chibolts/ADD234CB-2730-4C0E-929D-7A8BABA3DFD4%40andrew.cmu.edu <https://groups.google.com/d/msgid/chibolts/ADD234CB-2730-4C0E-929D-7A8BABA3DFD4%40andrew.cmu.edu?utm_medium=email&utm_source=footer>.
> For more options, visit https://groups.google.com/d/optout <https://groups.google.com/d/optout>.
> 
> 
> 
>> On Sep 13, 2017, at 13:49, Brielle Stark <brielle.stark at gmail.com> wrote:
>> 
>> Hello all.
>> 
>> I have a question about calculating word frequency. We're working with aphasia participants who often make mistakes, and when they do, we put the intended word into [: target] if we know what the intention was. However, I do not want to count [: target] words in the frequency tally of words. Basically, if someone said furry [: fairy] in one instance, and I am looking for a frequency count of the correctly spoken 'fairy,' I want the frequency calculation for 'fairy' to be 0, thus ignoring the word in the target. Further, I'd also like to count lemmas rather than inflected forms. In other words, if I'm looking for "stair," I want 'stairs' to be counted in the frequency of 'stair' usage.
>> 
>> Detail:
>> 
>> When I run the command:
>> 
>> freq -sm** -sm@* +sCinderella +sstair +sfairy
>> 
>> on the attached transcript [completely made up, by the way], it evaluates the %mor line but doesn't ignore the target [: target] words like I thought it would. It does do the correct job in tagging 'stair' even though the participant said 'stairs,' a correct usage from the %mor line. Output of frequency for this command was:
>> Cinderella: 1
>> stair: 1
>> fairy: 1
>> 
>> However, as I said, I wouldn't want the incorrect furry [: fairy] to count. So, I tried:
>> 
>> freq -sm** -sm@* +t*PAR +sCinderella +sstair +sfairy
>> 
>> Now that I've told CLAN to stick to the speaker tier, it then ignores 'stair' because 'stairs' was written, which isn't what we were going for. However, it correctly does not look within the [: target] and correctly states that 'fairy' was said 0 times. As an added point, I've also found that when I run the above command on transcripts, it sometimes gets the counts incorrect. For this command, I get the count:
>> Cinderella: 1
>> stair: 0
>> fairy: 0
>> 
>> So basically, is there any way to tell CLAN to run the analysis on the %mor tier for frequencies of words [specifically, lemmas], but somehow to specify to ignore [: target] words on the speaker tier? 
>> 
>> In an ideal world, from the attached transcript, I'd be getting the frequency counts as:
>> Cinderella: 1
>> stair: 1
>> fairy: 0
>> 
>> Thank you very much,
>> 
>> Brie
>> 
>> -- 
>> Brielle Stark, PhD
>> Post Doctoral Fellow in Communication Sciences and Disorders, University of South Carolina
>> t: +1 803-777-9240, alternate email: stark2 at mailbox.sc.edu
>> Aphasia Lab: http://web.asph.sc.edu/aphasia/
>> Center for the Study of Aphasia Recovery: http://web.asph.sc.edu/cstar/
>> 
