<div dir="ltr">Dear Brian,<br><br>But I'd like to get separate counts for English and Chinese words. Let me rephrase my question. In a predominantly English transcript, I'd like to get a count of ALL English words, including the ones marked with @s that are embedded in [- zho] lines. I can now achieve this by running two separate commands (TCH is teacher):<br><br>freq +tTCH -s"[- zho-yue]" -s"[- zho]" *.cha<br>freq +tTCH +s"*@s*" *.cha<br><br>There are two issues with the 2-command solution:<br>1. I get two sets of counts that need to be summed manually.<br>2. The same word is counted as two types, once with @s and once without.<br><br>I was wondering if there's a way to combine these two commands and resolve these two issues (or at least one). Below is an excerpt of my transcript (for IRB reasons, I cannot provide the whole transcript):<br><br>*TCH:    you can take this heart .<br>*TCH:    [- zho] this@s$n 星 .<br><br>Thank you so much for your patience and kindness.<br><br>Lulu<br><br>On Wednesday, June 29, 2016 at 2:41:46 PM UTC-4, Brian MacWhinney wrote:<blockquote class="gmail_quote" style="margin: 0;margin-left: 0.8ex;border-left: 1px #ccc solid;padding-left: 1ex;">







<div bgcolor="white" link="blue" vlink="purple" lang="EN-US">
<div>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Calibri">Dear Lulu,</span></p>
<p class="MsoNormal" style="text-indent:9.0pt"><span style="font-size:11.0pt;font-family:Calibri">I think you want +s"[- zho]" in this case, not -s"[- zho]". When I run</span></p>
<p class="MsoNormal" style="text-indent:9.0pt"><span style="font-size:11.0pt;font-family:Calibri"> </span></p>
<p class="MsoNormal" style="text-indent:9.0pt"><span style="font-size:11.0pt;font-family:Calibri">freq +s"[- yue]" +t*CHI *.cha +u</span></p>
<p class="MsoNormal" style="text-indent:9.0pt"><span style="font-size:11.0pt;font-family:Calibri"> </span></p>
<p class="MsoNormal" style="text-indent:9.0pt"><span style="font-size:11.0pt;font-family:Calibri">on CharlotteEng, I get both the English words marked as @s and the Cantonese. 
</span></p>
<p class="MsoNormal" style="text-indent:9.0pt"><span style="font-size:11.0pt;font-family:Calibri"> </span></p>
<p class="MsoNormal" style="text-indent:9.0pt"><span style="font-size:11.0pt;font-family:Calibri">--Brian</span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Calibri"> </span></p>
<div style="border:none;border-top:solid #b5c4df 1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal"><b><span style="font-family:Calibri;color:black">From: </span>
</b><span style="font-family:Calibri;color:black">ChiBolts <<a>chib...@googlegroups.com</a>> on behalf of Lulu <<a>lulu...@gmail.com</a>><br>
<b>Reply-To: </b>ChiBolts <<a>chib...@googlegroups.com</a>><br>
<b>Date: </b>Tuesday, June 28, 2016 at 10:34 PM<br>
<b>To: </b>ChiBolts <<a>chib...@googlegroups.com</a>><br>
<b>Subject: </b>Re: Running FREQ for bilingual transcripts</span></p>
</div>
<div>
<p class="MsoNormal"> </p>
</div>
<blockquote style="border:none;border-left:solid #b5c4df 4.5pt;padding:0in 0in 0in 4.0pt;margin-left:3.75pt;margin-right:0in">
<div>
<div>
<div>
<p class="MsoNormal">Hi Brian,<br>
<br>
I tried to run the reverse command on the same transcript (mostly English with a dozen words in Chinese)<br>
<br>
freq +tTCH -s"[- zho]" +s"*@s*" *.cha (I added * after @s because my transcript also tags whether the @s word is a noun or a verb)<br>
<br>
hoping to add the few @s English words embedded in [- zho] lines to the English word counts, but I only got 0's. With +s"*@s*" removed, I get good results, but they don't include the @s English words. I'm not sure how to fix this.<br>
<br>
Thanks!<br>
<br>
Lulu<br>
<br>
On Tuesday, June 28, 2016 at 10:22:40 PM UTC-4, Lulu wrote: </p>
<blockquote style="border:none;border-left:solid #cccccc 1.0pt;padding:0in 0in 0in 6.0pt;margin-left:4.8pt;margin-right:0in">
<div>
<p class="MsoNormal">Dear Brian,<br>
<br>
That just did magic! Thank you so much!<br>
<br>
Best,<br>
Lulu<br>
<br>
On Tuesday, June 28, 2016 at 10:15:26 PM UTC-4, Brian MacWhinney wrote: </p>
<blockquote style="border:none;border-left:solid #cccccc 1.0pt;padding:0in 0in 0in 6.0pt;margin-left:4.8pt;margin-right:0in">
<div>
<div>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Calibri">Dear Lulu,</span></p>
<p class="MsoNormal" style="text-indent:9.0pt">
<span style="font-size:11.0pt;font-family:Calibri">Without seeing your transcripts, I can’t say exactly what is wrong.  However, if you run this similar command on the CharlotteEng folder in the YipMatthews corpus, you get good results:</span></p>
<p class="MsoNormal" style="text-indent:9.0pt">
<span style="font-size:11.0pt;font-family:Calibri"> </span></p>
<p class="MsoNormal" style="text-indent:9.0pt">
<span style="font-size:11.0pt;font-family:Calibri">freq +t*CHI +s"[- yue]" *.cha</span></p>
<p class="MsoNormal" style="text-indent:9.0pt">
<span style="font-size:11.0pt;font-family:Calibri"> </span></p>
<p class="MsoNormal" style="text-indent:9.0pt">
<span style="font-size:11.0pt;font-family:Calibri">The idea is that this will include all words on the [- yue] lines, including those with @s, although the latter are pretty rare.  If you want to exclude those, just add -s"*@s".</span></p>
<p class="MsoNormal" style="text-indent:9.0pt">
<span style="font-size:11.0pt;font-family:Calibri"> </span></p>
<p class="MsoNormal" style="text-indent:9.0pt">
<span style="font-size:11.0pt;font-family:Calibri">-- Brian MacWhinney</span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Calibri"> </span></p>
<div style="border:none;border-top:solid #b5c4df 1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal"><b><span style="font-family:Calibri;color:black">From:
</span></b><span style="font-family:Calibri;color:black">ChiBolts <<a>chib...@googlegroups.com</a>> on behalf of Lulu <<a>lulu...@gmail.com</a>><br>
<b>Reply-To: </b>ChiBolts <<a>chib...@googlegroups.com</a>><br>
<b>Date: </b>Tuesday, June 28, 2016 at 5:10 PM<br>
<b>To: </b>ChiBolts <<a>chib...@googlegroups.com</a>><br>
<b>Subject: </b>Running FREQ for bilingual transcripts</span></p>
</div>
<div>
<p class="MsoNormal"> </p>
</div>
<div>
<div>
<div>
<p class="MsoNormal">Hi Brian and team members,<br>
<br>
I ran the freq command<br>
<br>
freq +tTCH +s"[- zho]" *.cha<br>
<br>
for transcripts that contain bilingual utterances (e.g., *TCH:    [- zho] this@s$n
<span style="font-family:'MS Mincho'" lang="ZH-CN">星</span>). The dominant language of the transcripts was English so we marked utterances that contained Chinese with [- zho]. The output types and tokens included all the English words that were marked @s. I
 thought I would get the types and tokens of all the Chinese words by running the above command. Is the problem with the transcript or the command?<br>
<br>
Thank you!<br>
<br>
Lulu</p>
</div>
<p class="MsoNormal">--
<br>
You received this message because you are subscribed to the Google Groups "chibolts" group.<br>
To unsubscribe from this group and stop receiving emails from it, send an email to <a>chibolts+u...@googlegroups.com</a><wbr>.<br>
To post to this group, send email to <a>chib...@googlegroups.com</a>.<br>
To view this discussion on the web visit <a href="https://groups.google.com/d/msgid/chibolts/0e5d867c-79b1-4d36-87be-1303f390a83b%40googlegroups.com?utm_medium=email&utm_source=footer">https://groups.google.com/d/msgid/chibolts/0e5d867c-79b1-4d36-87be-1303f390a83b%40googlegroups.com</a>.<br>
For more options, visit <a href="https://groups.google.com/d/optout">https://groups.google.com/d/optout</a>.</p>
</div>
</div>
</div>
</div>
</blockquote>
</div>
</blockquote>
</div>
</div>
</div>
</blockquote>
</div>
</div>

</blockquote></div>
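<p>The two issues with the two-command workaround (counts that must be summed by hand, and the same word counted as two types with and without @s) can also be handled outside CLAN. What follows is only a minimal Python sketch, not a FREQ option. The function name <code>count_words</code> is hypothetical, and the sketch assumes that each utterance sits on a single main-tier line, that a precode such as [- zho] comes right after the speaker code, and that @s always marks a switch away from the language of the line.</p>
<pre>
```python
import re
from collections import Counter

# Assumption: a language precode such as "[- zho]" or "[- zho-yue]"
# appears at the very start of the utterance text.
PRECODE = re.compile(r"\[- ([a-z-]+)\]\s*")


def count_words(lines, speaker="TCH"):
    """Return (english, other) word counts for one speaker.

    Assumes a mostly English transcript: a plain line is English, a line
    with a precode is in the other language, and @s flips the language of
    a single word. Continuation lines are ignored in this sketch.
    """
    english = Counter()
    other = Counter()
    prefix = "*" + speaker + ":"
    for line in lines:
        if not line.startswith(prefix):
            continue
        text = line[len(prefix):].strip()
        match = PRECODE.match(text)
        switched_line = match is not None
        if switched_line:
            text = text[match.end():]
        for token in text.split():
            # "this@s$n" splits into word "this", marker "@s", suffix "$n".
            word, marker, _suffix = token.partition("@s")
            if not any(ch.isalpha() for ch in word):
                continue  # skip punctuation such as the final "."
            has_marker = marker != ""
            # English is either an unmarked word on a plain line or an
            # @s word on a precoded line; the marker is dropped so both
            # forms count as one type.
            if switched_line == has_marker:
                english[word.lower()] += 1
            else:
                other[word.lower()] += 1
    return english, other


# The two-line excerpt from the message above, with a tab after the speaker code.
sample = [
    "*TCH:\tyou can take this heart .",
    "*TCH:\t[- zho] this@s$n 星 .",
]
english, other = count_words(sample)
print(english["this"], len(english), sum(english.values()))  # prints: 2 5 6
```
</pre>
<p>On the excerpt, "this" and "this@s$n" are merged into a single English type with two tokens, and 星 is counted separately as a Chinese word. Lowercasing the words is a choice made for this sketch and can be dropped if case matters.</p>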

<p></p>
