Frequency of words

Wed Sep 13 17:49:34 UTC 2017

Hello all.

I have a question about calculating word frequency. We're working with
aphasia participants who often make mistakes, and when they do, we put
the intended word into [: target] if we know what the intention was.
However, I do not want to count [: target] words in the frequency tally.
For example, if someone said furry [: fairy] in one instance, and I am
looking for a frequency count of the correctly spoken 'fairy,' I want the
count for 'fairy' to be 0, ignoring the word in the target. Further, I'd
also like to count lemmas rather than inflected forms. In other words, if
I'm looking for "stair," I want 'stairs' to be counted toward 'stair.'

*Detail:*

When I run the command:

*freq **-sm** -sm@* **+sCinderella +sstair +sfairy*

on the attached transcript [completely made up, by the way], it evaluates
the %mor line but doesn't ignore the [: target] words like I thought it
would. It does correctly count 'stair' even though the participant said
'stairs,' since the lemma is taken from the %mor line. The frequency
output for this command was:
Cinderella: 1
stair: 1
fairy: 1

However, as I said, I wouldn't want the incorrect furry [: fairy] to count.
So, I tried:

*freq -sm** -sm@* +t*PAR +sCinderella +sstair +sfairy*

Now that I've told CLAN to stick to the speaker tier, it ignores 'stair'
because 'stairs' was written, which isn't what we were going for.
However, it correctly does not look within the [: target] and correctly
states that 'fairy' was said 0 times. As an added point, I've also found
that when I run the above command on transcripts, it sometimes gets the
counts wrong. For this command, I get the counts:
Cinderella: 1
stair: 0
fairy: 0

So basically, is there any way to tell CLAN to run the frequency analysis
on the %mor tier [specifically, for lemmas] while ignoring [: target]
words on the speaker tier?

In an ideal world, from the attached transcript, I'd be getting the
frequency counts as:
Cinderella: 1
stair: 1
fairy: 0
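
To make the desired tally concrete, here is a rough sketch of the counting
logic in Python, outside CLAN. It is only an illustration under several
assumptions: the utterance and its %mor line are invented stand-ins (the
attached transcript is not reproduced here), the main tier and %mor tier
are assumed to align one item per word, and the lemma is taken as the text
after "|" up to the first "-" or "&". The function name lemma_counts is
hypothetical, and clitics or compounds are not handled.

```python
import re

# Hypothetical utterance pair, invented for illustration only.
MAIN = "*PAR:\tCinderella went up the stairs to the furry [: fairy] ."
MOR = ("%mor:\tn:prop|Cinderella v|go&PAST adv|up det:art|the "
       "n|stair-PL prep|to det:art|the n|fairy .")

PUNCT = {".", "?", "!"}
# A word, optionally followed by a "[: target]" replacement.
TOKEN = re.compile(r"(\S+)(\s+\[:\s*[^\]]+\])?")


def lemma_counts(main, mor, targets):
    """Count target lemmas from %mor, skipping words replaced via [: target]."""
    body = main.split("\t", 1)[1]
    # Each entry: (spoken word, True if it carries a [: target] replacement).
    words = [(m.group(1), m.group(2) is not None)
             for m in TOKEN.finditer(body) if m.group(1) not in PUNCT]
    items = [i for i in mor.split("\t", 1)[1].split() if i not in PUNCT]
    if len(words) != len(items):
        raise ValueError("main and %mor tiers do not align one-to-one")

    counts = {t: 0 for t in targets}
    for (_, replaced), item in zip(words, items):
        if replaced:
            continue  # the speaker did not actually say the target word
        stem = item.split("|", 1)[1]          # drop the part-of-speech tag
        lemma = re.split(r"[-&]", stem)[0]    # drop inflectional suffixes
        if lemma in counts:
            counts[lemma] += 1
    return counts


print(lemma_counts(MAIN, MOR, ["Cinderella", "stair", "fairy"]))
```

On this invented utterance the sketch counts 'stairs' toward 'stair' via
the %mor lemma and skips 'fairy' because it only appears as a [: target]
replacement, which is the behavior I am hoping to get from freq.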

Thank you very much,

Brie

-- 
*Brielle Stark, PhD*
Post Doctoral Fellow in Communication Sciences and Disorders, University of
South Carolina
t: +1 803-777-9240, alternate email: stark2 at mailbox.sc.edu
Aphasia Lab: http://web.asph.sc.edu/aphasia/
Center for the Study of Aphasia Recovery: http://web.asph.sc.edu/cstar/

-------------- next part --------------
A non-text attachment was scrubbed...
Name: Content_Tester.cha
Type: application/octet-stream
Size: 474 bytes
Desc: not available
URL: <http://listserv.linguistlist.org/pipermail/chibolts/attachments/20170913/954f7fba/attachment-0002.obj>