Use of FREQ in Bilingual Corpora
Leonid Spektor
spektor at andrew.cmu.edu
Tue Apr 8 03:12:06 UTC 2025
Hi Danielle,
1. To exclude utterances that contain code switching, you need two passes. First, exclude the utterances that contain words marked with @s:tgl. The following KWAL command assumes that you did not mark words with @s:tgl on [- tgl] utterances:
kwal +o@ +o% +d -s*@s:tgl* +f *.cha
After that you can run your FREQ command:
freq +l1 +t*CAR +s"<- eng>" +d2 *.kwal.cex
You can take care of both @s:tgl and @s:eng code switching with one KWAL command. Like the KWAL command above, this one also assumes that you did not mark words with @s:eng on [- eng] utterances:
kwal +o@ +o% +d -s*@s:tgl* -s*@s:eng* +f *.cha
Then run this command:
freq +l1 +t*CAR +s"<- eng>" +s"<- tgl>" +d2 *.kwal.cex
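If you want to cross-check the counts from the two passes above on a few transcripts, here is a minimal Python sketch of the same idea. It is not part of CLAN and not how KWAL or FREQ work internally. It assumes that every *CAR utterance carries a [- eng] or [- tgl] precode (utterances without a precode are skipped), that switched words are marked with @s:tgl or @s:eng, and that continuation lines start with a tab. The function names are made up for illustration.

```python
import re
from collections import Counter

PRECODE = re.compile(r"\[- (\w+)\]")   # e.g. [- eng], [- tgl]
SWITCH = re.compile(r"@s:\w+")         # e.g. aso@s:tgl


def main_tier_utterances(lines, speaker="CAR"):
    """Yield each main-tier utterance of one speaker as a single string,
    joining tab-indented continuation lines."""
    current = None
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("\t") and current is not None:
            current += " " + line.strip()
            continue
        if current is not None:
            yield current
            current = None
        if line.startswith("*" + speaker + ":"):
            current = line.split(":", 1)[1].strip()
    if current is not None:
        yield current


def count_unswitched(lines, speaker="CAR"):
    """Count utterances per precode language, skipping any utterance
    that contains a word marked with @s:<lang>."""
    counts = Counter()
    for utterance in main_tier_utterances(lines, speaker):
        precode = PRECODE.search(utterance)
        if precode is None or SWITCH.search(utterance):
            continue
        counts[precode.group(1)] += 1
    return counts
```

To use it, pass the lines of one .cha file, for example count_unswitched(open(path, encoding="utf-8")), where path is whichever transcript you want to check.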
2. To identify utterances that have both a precode of [- eng] and a postcode of [+ b], you can again use two passes:
kwal -d +o@ +o% +s"[+ b]" +f *.cha
kwal -d +l1 +s"[- eng]" *.kwal.cex
OR one COMBO command:
combo +o@ +o% -d +l1 +s"[- eng]^*^[+ b]" +d +f *.cha
3. To see which code-switched utterances are related to the postcode [+ b], first run this KWAL command:
kwal +o@ +o% +d +s*@s:tgl* +s*@s:eng* +s"[+ b]" +f *.cha
Then run the FREQ command:
freq +l1 +t*CAR +s"<- eng>" +s"<- tgl>" +d2 *.kwal.cex
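Since this is only 6 participants, a short Python sketch can also tabulate precode, postcode, and code switching together, as a sanity check on the CLAN output. This is an independent illustration, not a CLAN feature. It assumes each *CAR utterance sits on a single line, that precodes look like [- eng], postcodes like [+ b] or [+ e], and that switched words carry @s:tgl or @s:eng. The names classify and cross_tabulate are hypothetical.

```python
import re
from collections import Counter

PRECODE = re.compile(r"\[- (\w+)\]")    # language precode, e.g. [- eng]
POSTCODE = re.compile(r"\[\+ (\w+)\]")  # postcode, e.g. [+ b] or [+ e]
SWITCH = re.compile(r"@s:\w+")          # switched word, e.g. dog@s:eng


def classify(utterance):
    """Return (precode, postcode, has_code_switch) for one utterance.
    A missing precode or postcode is reported as None."""
    pre = PRECODE.search(utterance)
    post = POSTCODE.search(utterance)
    return (
        pre.group(1) if pre else None,
        post.group(1) if post else None,
        bool(SWITCH.search(utterance)),
    )


def cross_tabulate(chat_lines, speaker="CAR"):
    """Count the speaker's utterances by (precode, postcode, switched).
    Assumes one utterance per line (no continuation lines)."""
    prefix = "*" + speaker + ":"
    return Counter(
        classify(line[len(prefix):])
        for line in chat_lines
        if line.startswith(prefix)
    )
```

Each key of the returned counter is a triple such as ("eng", "b", True), so code-switched English utterances tied to the book text can be read off directly, and the same table covers [+ e] and [- tgl].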
Leonid.
> On Apr 7, 2025, at 20:21, Danielle Hu <daniellekaryllehu at gmail.com> wrote:
>
> Hi all,
>
> I'm doing some CLAN analyses for my Master's thesis, and maybe I'm missing it in the manual trying to teach myself how to do this, but I'm struggling to code the analyses I want. For context, I am trying to count the number of parent/caregiver utterances in a transcript that are in each language, and of those utterances, which are related to a postcode that says whether the utterance was related to directly reading from the story provided, or extra textual speech (e.g., questioning, commenting about the story, etc.). I am analyzing this to determine whether changing the order of language presentation in a bilingual book impacts caregiver language use in Tagalog and English.
>
> I've processed the following command to get a frequency of the number of caregiver utterances that are marked as English and Tagalog: freq +l1 +t*CAR +s"<- eng>" +s"<- tgl>" +d2 *.cha
>
> I want to do the following things:
> 1. Exclude utterances that have code switching. I have attempted freq +l1 +t*CAR +s"<- eng> -s"@s:tgl" +d2 *.cha, but this does not work. It outputs the same number of English utterances as the first command.
> 2. Identify utterances that have both a precode of [- eng] and a postcode of [+ b] (book) or [+ e] (extra-textual), and vice versa for [- tgl]
> 3. Of code-switched utterances, which ones are related to the postcode of [+ b] or [+ e]
> I hope that's a clear explanation of what I'm looking for! If CLAN doesn't have a way to do this, that's also okay, but wanted to check with everyone to make sure that it wasn't a possibility before I start manually counting things. Luckily, it's only for 6 participants. Thanks in advance!
>
> All the best,
> Danielle Hu
>
>
> --
> You received this message because you are subscribed to the Google Groups "chibolts" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to chibolts+unsubscribe at googlegroups.com <mailto:chibolts+unsubscribe at googlegroups.com>.
> To view this discussion visit https://groups.google.com/d/msgid/chibolts/CAEGyqUg21cgc-1K41R5hu%3D%2BNXjjCOCxa2A8c34cwRc6a_h8wwQ%40mail.gmail.com <https://groups.google.com/d/msgid/chibolts/CAEGyqUg21cgc-1K41R5hu%3D%2BNXjjCOCxa2A8c34cwRc6a_h8wwQ%40mail.gmail.com?utm_medium=email&utm_source=footer>.