From daniellekaryllehu at gmail.com Tue Apr 8 00:21:37 2025
From: daniellekaryllehu at gmail.com (Danielle Hu)
Date: Mon, 7 Apr 2025 19:21:37 -0500
Subject: Use of FREQ in Bilingual Corpora
Message-ID:

Hi all,

I'm doing some CLAN analyses for my Master's thesis, and maybe I'm missing it in the manual trying to teach myself how to do this, but I'm struggling to code the analyses I want.

For context, I am trying to count the number of parent/caregiver utterances in a transcript that are in each language, and of those utterances, which are related to a postcode that says whether the utterance was related to directly reading from the story provided, or extratextual speech (e.g., questioning, commenting about the story, etc.). I am analyzing this to determine whether changing the order of language presentation in a bilingual book impacts caregiver language use in Tagalog and English.

I've processed the following command to get a frequency of the number of caregiver utterances that are marked as English and Tagalog:

freq +l1 +t*CAR +s"<- eng>" +s"<- tgl>" +d2 *.cha

I want to do the following things:

- Exclude utterances that have code switching. I have attempted freq +l1 +t*CAR +s"<- eng> -s"@s:tgl" +d2 *.cha, but this does not work. It outputs the same number of English utterances as the first command.
- Identify utterances that have both a precode of [- eng] and a postcode of [+ b] (book) or [+ e] (extra-textual), and vice versa for [- tgl]
- Of code-switched utterances, which ones are related to the postcode of [+ b] or [+ e]

I hope that's a clear explanation of what I'm looking for! If CLAN doesn't have a way to do this, that's also okay, but wanted to check with everyone to make sure that it wasn't a possibility before I start manually counting things. Luckily, it's only for 6 participants.

Thanks in advance!

All the best,
Danielle Hu

--
You received this message because you are subscribed to the Google Groups "chibolts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to chibolts+unsubscribe at googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/chibolts/CAEGyqUg21cgc-1K41R5hu%3D%2BNXjjCOCxa2A8c34cwRc6a_h8wwQ%40mail.gmail.com.

From spektor at andrew.cmu.edu Tue Apr 8 03:12:06 2025
From: spektor at andrew.cmu.edu (Leonid Spektor)
Date: Mon, 7 Apr 2025 23:12:06 -0400
Subject: Use of FREQ in Bilingual Corpora
In-Reply-To:
References:
Message-ID:

Hi Danielle,

1. To exclude utterances that have code switching, you need to do it in two passes. First, exclude the utterances that have words with @s:tgl. The following KWAL command assumes that you did not code words with @s:tgl on [- tgl] utterances:

kwal +o@ +o% +d -s*@s:tgl* +f *.cha

After that you can run your FREQ command:

freq +l1 +t*CAR +s"<- eng>" +d2 *.kwal.cex

You can take care of both @s:tgl and @s:eng code switching with one KWAL command. As with the KWAL command above, this command also assumes that you did not code words with @s:eng on [- eng] utterances:

kwal +o@ +o% +d -s*@s:tgl* -s*@s:eng* +f *.cha

Next, run this command:

freq +l1 +t*CAR +s"<- eng>" +s"<- tgl>" +d2 *.kwal.cex

2. To identify utterances that have both a precode of [- eng] and a postcode of [+ b], again you can do it in two passes:

kwal -d +o@ +o% +s"[+ b]" +f *.cha
kwal -d +l1 +s"[- eng]" *.kwal.cex

OR with one COMBO command:

combo +o@ +o% -d +l1 +s"[- eng]^*^[+ b]" +d +f *.cha

3. Of code-switched utterances, which ones are related to the postcode of [+ b]. First pass, KWAL command:

kwal +o@ +o% +d +s*@s:tgl* +s*@s:eng* +s"[+ b]" +f *.cha

Next, FREQ command:

freq +l1 +t*CAR +s"<- eng>" +s"<- tgl>" +d2 *.kwal.cex

Leonid.
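With only 6 participants, it may be worth cross-checking the CLAN counts with a small script. The following is a rough Python sketch, not a CLAN feature and not part of the commands above. It tallies one speaker's utterances by language precode, postcode, and whether any word carries an @s switch marker. The names `main_tier_utterances` and `tally` are made up for illustration. It assumes that every caregiver utterance carries an explicit [- eng] or [- tgl] precode (as the FREQ command in the question suggests), that switches are marked at the word level with @s or @s:lang, and that continuation lines begin with a tab.

```python
import re
from collections import Counter

# Assumed coding scheme (taken from the thread, not from any CLAN API):
PRECODE = re.compile(r"\[-\s*(\w+)\]")         # language precode, e.g. [- eng]
POSTCODE = re.compile(r"\[\+\s*(\w+)\]")       # postcode, e.g. [+ b] or [+ e]
SWITCH = re.compile(r"\S@s(?::\w+)?(?!\w)")    # word-level marker, e.g. aso@s:tgl


def main_tier_utterances(chat_text, speaker="CAR"):
    """Yield each main-tier utterance of `speaker` as one string.

    Tab-indented continuation lines are joined to the main tier they
    continue; headers (@...) and dependent tiers (%...) are skipped.
    """
    current = None  # pieces of the utterance being collected, if any
    for line in chat_text.splitlines():
        if line.startswith("\t"):
            if current is not None:
                current.append(line.strip())
            continue
        # Any non-continuation line ends the utterance in progress.
        if current is not None:
            yield " ".join(current)
            current = None
        if line.startswith("*" + speaker + ":"):
            current = [line.split(":", 1)[1].strip()]
    if current is not None:
        yield " ".join(current)


def tally(chat_text, speaker="CAR"):
    """Count utterances by (language precode, postcode, switched/single)."""
    counts = Counter()
    for utt in main_tier_utterances(chat_text, speaker):
        lang = PRECODE.search(utt)
        post = POSTCODE.search(utt)
        key = (
            lang.group(1) if lang else "unmarked",
            post.group(1) if post else "none",
            "switched" if SWITCH.search(utt) else "single",
        )
        counts[key] += 1
    return counts


# Invented sample transcript lines, for illustration only.
sample = "\n".join([
    "@Begin",
    "*CAR:\t[- eng] the dog ran home . [+ b]",
    "%com:\treading",
    "*CAR:\t[- eng] look at the aso@s:tgl . [+ e]",
    "*CHI:\t[- tgl] aso .",
    "*CAR:\t[- tgl] ang ganda ng",
    "\tbahay . [+ e]",
    "@End",
])

for key, n in sorted(tally(sample).items()):
    print(key, n)
```

To run it on real data, read each .cha file as UTF-8 text and pass the contents to `tally`. The counts for "single" utterances should line up with the two-pass KWAL and FREQ results if the coding assumptions hold; a mismatch is a hint to inspect the transcripts rather than proof that either count is wrong.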