Using the gem and freq commands on bilingual data to get types and tokens by language for different activities

Leonid Spektor spektor at andrew.cmu.edu
Tue Feb 25 03:08:03 UTC 2020


Sarah,

	The English language code is three letters, as all other language codes are. For English the code is "eng", so the @Languages header should list "spa, eng" rather than "spa, en".
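	The quoted message below mentions pasting the @Languages line by hand into each file produced by gem. As a stopgap, that step could be scripted outside CLAN. The following is only a minimal sketch in Python, not a CLAN feature: the function names are hypothetical, it assumes the header sits on a single line beginning with "@Languages:", and it copies that line verbatim, so a wrong code such as "en" still has to be corrected in the original transcript first.

```python
LANG_HEADER = "@Languages:"


def find_languages_line(chat_text):
    """Return the first @Languages header line in a CHAT transcript, or None."""
    for line in chat_text.splitlines():
        if line.startswith(LANG_HEADER):
            return line
    return None


def restore_languages_header(original_text, extracted_text):
    """Copy the @Languages line from the original transcript into the
    extracted transcript if the extracted one lacks it.

    Hypothetical helper: it works on the file contents as strings and
    makes no assumption about how CLAN names its output files.
    """
    if find_languages_line(extracted_text) is not None:
        return extracted_text  # header already present, nothing to do

    lang_line = find_languages_line(original_text)
    if lang_line is None:
        raise ValueError("original transcript has no @Languages header")

    lines = extracted_text.splitlines()
    # Place the header right after @Begin when present, else at the top.
    insert_at = 0
    for i, line in enumerate(lines):
        if line.startswith("@Begin"):
            insert_at = i + 1
            break
    lines.insert(insert_at, lang_line)
    return "\n".join(lines) + "\n"
```

	To use it, read the original .cha file and the corresponding gem output as UTF-8 text, pass both strings to restore_languages_header, and write the result back to the gem output file before running freq.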


Leonid.

> On Feb 24, 2020, at 20:18, Sarah Surrain <sarahsurrain at gmail.com> wrote:
> 
> Hello,
> 
> I am working with Spanish-English bilingual data from parent-child dyads. In the header, I have specified the languages as 
> @Languages:    spa, en
> and I have used [- eng] precodes for English utterances and @s tags on English words embedded in Spanish utterances.
> 
> I would like to use gem markers to segment the transcripts by activity (such as book reading) and then run a freq command to count the number of tokens in Spanish and English used by the parent and child during that activity.
> 
> I used this command to retrieve only the book reading activities and create new CHAT files with headers:
> gem +sbook +d1 +f *.cha
> 
> Then I tried these commands on the output to create an Excel file with the types, tokens, TTR and MATTR for each language:
> freq +l +s*@s:eng +d3 +b10 *.cha
> freq +l +s*@s:spa +d3 +b10 *.cha
> 
> However, I am getting these errors: 
> Language "eng" is not defined on "@Languages:" header tier.
> and
> Illegal use of "@s", no alternative language in position 1 defined on @Language: tier.
> 
> I can fix this by manually pasting the @Languages line into the header in the new file that I created using the gem command. Is there a way to automatically create CHAT files using the gem command that retain the @Languages line?
> 
> (I also tried the gemfreq command (gemfreq +sbook +l +s*@s:eng +d3 +b10 *.cha), but I wasn't able to create an Excel worksheet with the types, tokens, etc. for each participant. I got the error: The only +d levels allowed are 0–1.)
> 
> Thank you!
> 
> Sarah Surrain 
> 
> Sarah Surrain, Ed.M.
> Ph.D. Candidate
> Harvard University FAS | GSE
> https://scholar.harvard.edu/sarahsurrain
> 
> -- 
> You received this message because you are subscribed to the Google Groups "chibolts" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to chibolts+unsubscribe at googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/chibolts/11b1fe6c-0bf5-421f-98b1-2e76aeaa71bd%40googlegroups.com.
