Exclude marked text when getting MATTR from MOR

Amanda Huensch amandahuensch at gmail.com
Wed Dec 7 20:52:15 UTC 2022


Leonid and Brian,

Thank you very much! Because we exclude both partial and whole utterances,
the [e] will work best for us. I was able to batch update [% g] to [e]
easily with chstring +w +cCODEchange.cut, and now the MATTR results only
include the language we’re interested in.
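
For readers without CLAN at hand, the effect of that batch change can be sketched in Python. This is not chstring and it does not read CODEchange.cut; it is a plain literal substitution, assuming the code always appears exactly as "[% g]". The names replace_code and update_folder are hypothetical, and the sketch is best run on a copy of the data.

```python
from pathlib import Path


def replace_code(text: str, old: str = "[% g]", new: str = "[e]") -> str:
    """Return text with every literal occurrence of `old` replaced by `new`."""
    return text.replace(old, new)


def update_folder(folder: str) -> int:
    """Rewrite every .cha file under `folder` in place; return the number changed."""
    changed = 0
    for path in Path(folder).rglob("*.cha"):
        original = path.read_text(encoding="utf-8")
        updated = replace_code(original)
        if updated != original:
            path.write_text(updated, encoding="utf-8")
            changed += 1
    return changed


if __name__ == "__main__":
    # One main-tier line from the quoted transcript, with the old code.
    sample = "*151:\t<vale> [% g] ."
    print(replace_code(sample))
```

Only the bracketed code is changed; the < > scoping and the words inside it are left untouched.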

Much appreciated!!

Amanda
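
As background on why leftover coded words shift the values: MATTR is the mean type-token ratio over every window of N consecutive tokens, so any extra tokens on the %mor tier change the windows being averaged. The sketch below assumes that definition, with N = 10 taken from the +b10 switch in the quoted command (assumed here to set the window size). It is not FREQ's implementation; the function name mattr and the fallback for inputs shorter than the window are illustrative choices.

```python
def mattr(tokens, window=10):
    """Moving-average type-token ratio over all windows of `window` tokens."""
    if window < 1:
        raise ValueError("window must be at least 1")
    n = len(tokens)
    if n == 0:
        return 0.0
    if n < window:
        # Assumption: fall back to the plain type-token ratio for short input.
        return len(set(tokens)) / n
    # One ratio per window position: distinct tokens divided by window size.
    ratios = [
        len(set(tokens[i:i + window])) / window
        for i in range(n - window + 1)
    ]
    return sum(ratios) / len(ratios)


if __name__ == "__main__":
    # Hypothetical token list; real input would be the stems FREQ reads from %mor.
    tokens = ["a", "a", "b"]
    print(mattr(tokens, window=2))
```

With a window of 2, the tokens a, a, b give two windows with ratios 0.5 and 1.0, so the result is 0.75.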

On Wed, Dec 7, 2022 at 1:44 PM Leonid Spektor <spektor at andrew.cmu.edu>
wrote:

> Amanda,
>
> One more suggestion: if you always want to exclude only whole utterances
> from the MATTR computation, then Brian's suggestion of adding a [+ exc]
> post-code to the end of the utterance will do the job. For the MATTR
> computation you would use the -s"[+ exc]" option to tell FREQ to exclude
> all utterances with the "[+ exc]" post-code. This way you need just one
> copy of the data. If you want to exclude only specific word(s) within an
> utterance from MATTR, then the "[+ exc]" post-code will not work.
>
>
> Leonid.
>
> On Dec 7, 2022, at 13:11, Leonid Spektor <spektor at andrew.cmu.edu> wrote:
>
> Amanda,
>
> The < > [% g] does not affect the creation of the %mor tier. I would suggest
> you replace it with < > [e]. This will prevent words marked with [e]
> from being placed on the %mor tier, and your FREQ MATTR command will work the
> way you want. Of course, this will create problems for other analyses
> if words within [e] need to be analyzed by MLU or KIDEVAL or other
> commands. You might need to have two copies of your data: one with [e] for
> MATTR analyses and another without [e] for other analyses.
>
>
> Leonid.
>
> On Dec 7, 2022, at 10:32, Amanda Huensch <amandahuensch at gmail.com> wrote:
>
> Hello,
>
> I am attempting to get MATTR values from the MOR line of transcripts in
> which we have coded speech to be ignored using < > [% g] as in the
> following:
>
> *151:     <vale> [% g] .
> *151:     esta es un [//] una historia acerca de dos hermanos, Gustavo y
>                 Jorge [^c] .
> *151:     <&um Jorge es el hermano mayor &eh quien se traslado a otra ciudad
>                 en el año dos mil porque empezó su carrera universitaria> [% g] .
> *151:     &ehm cuando salió Jorge [^c] Gustavo se sentía muy solo [^c] porque
>                 antes ju(gaba) [/] jugaba siempre con Jorge [^c] .
>
>
> I can use the following command, which outputs MATTR, but I realized that
> it includes the < > [% g] coded text:
>
> freq @ +t*1* +t%mor +b10 +sm;*,o% -sm|neo +d3
>
> I tried adding the switch -s"<% g>", which works with a simple FREQ command,
> as follows, but received the same Type/Token/MATTR values as when I ran the
> above command.
>
> freq @ +t*1* -s"<% g>" +t%mor +b10 +sm;*,o% -sm|neo +d3
>
> I also tried using the -s"<% g>" switch during the MOR step
> (mor -s"<% g>" +t*1* @) but received a message to only use language codes
> with the -s option.
>
> Is there a way to ignore the < > [% g] coded text when running MOR? Or if
> not, is there a way to ignore the < > [% g] coded text when calculating
> MATTR with the FREQ command?
>
> Thank you for your help!
>
> Amanda
>
> --
> You received this message because you are subscribed to the Google Groups
> "chibolts" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to chibolts+unsubscribe at googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/chibolts/969d97ab-b817-4228-852c-1e3906a123f4n%40googlegroups.com
> <https://groups.google.com/d/msgid/chibolts/969d97ab-b817-4228-852c-1e3906a123f4n%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
