<html xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta name="Title" content="">
<meta name="Keywords" content="">
<meta name="Generator" content="Microsoft Word 15 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
        {font-family:"Cambria Math";
        panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
        {font-family:DengXian;
        panose-1:2 1 6 0 3 1 1 1 1 1;}
@font-face
        {font-family:Calibri;
        panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
        {font-family:"MS Mincho";
        panose-1:2 2 6 9 4 2 5 8 3 4;}
@font-face
        {font-family:MingLiU;
        panose-1:2 2 5 9 0 0 0 0 0 0;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0in;
        margin-bottom:.0001pt;
        font-size:12.0pt;
        font-family:"Times New Roman";}
a:link, span.MsoHyperlink
        {mso-style-priority:99;
        color:blue;
        text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
        {mso-style-priority:99;
        color:purple;
        text-decoration:underline;}
p
        {mso-style-priority:99;
        mso-margin-top-alt:auto;
        margin-right:0in;
        mso-margin-bottom-alt:auto;
        margin-left:0in;
        font-size:12.0pt;
        font-family:"Times New Roman";}
span.EmailStyle18
        {mso-style-type:personal-reply;
        font-family:Calibri;
        color:windowtext;}
span.msoIns
        {mso-style-type:export-only;
        mso-style-name:"";
        text-decoration:underline;
        color:teal;}
.MsoChpDefault
        {mso-style-type:export-only;
        font-size:10.0pt;}
@page WordSection1
        {size:8.5in 11.0in;
        margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
        {page:WordSection1;}
--></style>
</head>
<body bgcolor="white" lang="EN-US" link="blue" vlink="purple">
<div class="WordSection1">
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Calibri">Dear Lulu,<o:p></o:p></span></p>
<p class="MsoNormal" style="text-indent:9.0pt"><span style="font-size:11.0pt;font-family:Calibri">In that case, you can add the +l switch, which marks each word overtly for its language. The command that would work for CharlotteEng is:<o:p></o:p></span></p>
<p class="MsoNormal" style="text-indent:9.0pt"><span style="font-size:11.0pt;font-family:Calibri"><o:p> </o:p></span></p>
<p class="MsoNormal" style="text-indent:9.0pt"><span style="font-size:11.0pt;font-family:Calibri">freq +l  +s*@s:eng *.cha +u +f +t*CHI<o:p></o:p></span></p>
<p class="MsoNormal" style="text-indent:9.0pt"><span style="font-size:11.0pt;font-family:Calibri"><o:p> </o:p></span></p>
<p class="MsoNormal" style="text-indent:9.0pt"><span style="font-size:11.0pt;font-family:Calibri">--Brian<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Calibri"><o:p> </o:p></span></p>
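<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Calibri">For anyone who still ends up with two separate FREQ outputs, as in the two-command approach quoted below, here is a minimal Python sketch of summing them and folding @s-marked forms into their unmarked counterparts. It is not part of CLAN. It assumes each count line consists of a number followed by a single word, and the helper names (parse_counts, strip_language_marker, merge_counts) are hypothetical.<o:p></o:p></span></p>
<pre>
```python
import re
from collections import Counter


def parse_counts(text):
    """Parse lines of the assumed form 'COUNT WORD' into a Counter."""
    counts = Counter()
    for line in text.splitlines():
        parts = line.split()
        # Skip headers and summary lines that are not exactly 'number word'.
        if len(parts) != 2 or not parts[0].isdigit():
            continue
        counts[parts[1]] += int(parts[0])
    return counts


def strip_language_marker(word):
    """Drop a trailing @s marker and any suffix, e.g. 'this@s$n' becomes 'this'."""
    return re.sub(r"@s([:$].*)?$", "", word)


def merge_counts(*tables):
    """Sum several count tables after removing @s markers from the keys."""
    merged = Counter()
    for table in tables:
        for word, n in table.items():
            merged[strip_language_marker(word)] += n
    return merged


# Made-up sample data standing in for the two FREQ outputs.
english_lines = "2 this\n1 heart\n"
embedded_words = "1 this@s$n\n"
combined = merge_counts(parse_counts(english_lines), parse_counts(embedded_words))
print(combined["this"], len(combined), sum(combined.values()))  # prints: 3 2 4
```
</pre>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Calibri">With the sample data, "this" and "this@s$n" collapse into one type with three tokens, so both the manual summing and the duplicate-type issue go away. Real FREQ output may need a different parsing rule.<o:p></o:p></span></p>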
<div style="border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal"><b><span style="font-family:Calibri;color:black">From: </span>
</b><span style="font-family:Calibri;color:black">ChiBolts &lt;chibolts@googlegroups.com&gt; on behalf of Lulu &lt;lulusong@gmail.com&gt;<br>
<b>Reply-To: </b>ChiBolts &lt;chibolts@googlegroups.com&gt;<br>
<b>Date: </b>Wednesday, June 29, 2016 at 4:56 PM<br>
<b>To: </b>ChiBolts &lt;chibolts@googlegroups.com&gt;<br>
<b>Subject: </b>Re: Running FREQ for bilingual transcripts<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
<blockquote style="border:none;border-left:solid #B5C4DF 4.5pt;padding:0in 0in 0in 4.0pt;margin-left:3.75pt;margin-right:0in" id="MAC_OUTLOOK_ATTRIBUTION_BLOCKQUOTE">
<div>
<div>
<div>
<p class="MsoNormal">Dear Brian,<br>
<br>
But I'd like to get separate counts for English and Chinese words. Let me rephrase my question. In a predominantly English transcript, I'd like to get a count of ALL English words, including the ones embedded in [- zho] lines marked with @s. I can now achieve
 this by running two separate commands (TCH is teacher):<br>
<br>
freq +tTCH -s"[- zho-yue]" -s"[- zho]" *.cha<br>
freq +tTCH +s*@s* *.cha<br>
<br>
There are two issues with the 2-command solution:<br>
1. I get two sets of counts that need to be summed manually.<br>
2. The same word with and without @s is counted as two separate types.<br>
<br>
I was wondering if there's a way to combine these two commands and resolve these two issues (or at least one). Below is an excerpt of my transcript (TCH is teacher) (for IRB reasons, I cannot provide the whole transcript):<br>
<br>
*TCH:    you can take this heart .<br>
*TCH:    [- zho] this@s$n <span lang="ZH-CN" style="font-family:'MS Mincho'">星</span> .<span style="font-family:MingLiU"><br>
<br>
</span>Thank you so much for your patience and kindness.<span style="font-family:MingLiU"><br>
<br>
</span>Lulu<span style="font-family:MingLiU"><br>
<br>
</span>On Wednesday, June 29, 2016 at 2:41:46 PM UTC-4, Brian MacWhinney wrote: <o:p>
</o:p></p>
<blockquote style="border:none;border-left:solid #CCCCCC 1.0pt;padding:0in 0in 0in 6.0pt;margin-left:4.8pt;margin-right:0in">
<div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span style="font-size:11.0pt;font-family:Calibri">Dear Lulu,</span><o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;text-indent:9.0pt">
<span style="font-size:11.0pt;font-family:Calibri">I think you want +s"[- zho]" in this case, not -s"[- zho]". When I run</span><o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;text-indent:9.0pt">
<span style="font-size:11.0pt;font-family:Calibri"> </span><o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;text-indent:9.0pt">
<span style="font-size:11.0pt;font-family:Calibri">freq +s"[- yue]" +t*CHI *.cha +u</span><o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;text-indent:9.0pt">
<span style="font-size:11.0pt;font-family:Calibri"> </span><o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;text-indent:9.0pt">
<span style="font-size:11.0pt;font-family:Calibri">on CharlotteEng, I get both the English words marked as @s and the Cantonese. 
</span><o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;text-indent:9.0pt">
<span style="font-size:11.0pt;font-family:Calibri"> </span><o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;text-indent:9.0pt">
<span style="font-size:11.0pt;font-family:Calibri">--Brian</span><o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span style="font-size:11.0pt;font-family:Calibri"> </span><o:p></o:p></p>
<div style="border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><b><span style="font-family:Calibri;color:black">From:
</span></b><span style="font-family:Calibri;color:black">ChiBolts &lt;<a href="javascript:" target="_blank">chib...@googlegroups.com</a>&gt; on behalf of Lulu &lt;<a href="javascript:" target="_blank">lulu...@gmail.com</a>&gt;<br>
<b>Reply-To: </b>ChiBolts &lt;<a href="javascript:" target="_blank">chib...@googlegroups.com</a>&gt;<br>
<b>Date: </b>Tuesday, June 28, 2016 at 10:34 PM<br>
<b>To: </b>ChiBolts &lt;<a href="javascript:" target="_blank">chib...@googlegroups.com</a>&gt;<br>
<b>Subject: </b>Re: Running FREQ for bilingual transcripts</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
</div>
<blockquote style="border:none;border-left:solid #B5C4DF 4.5pt;padding:0in 0in 0in 4.0pt;margin-left:3.75pt;margin-top:5.0pt;margin-right:0in;margin-bottom:5.0pt">
<div>
<div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">Hi Brian,<br>
<br>
I tried to run the reverse command on the same transcript (mostly English with a dozen words in Chinese)<br>
<br>
freq +tTCH -s"[- zho]" +s"*@s*" *.cha (I added * after @s because my transcript also tags whether the @s word is a noun or a verb)<span style="font-family:MingLiU"><br>
<br>
</span>hoping to add the few @s English words embedded in [- zho] lines to the English word counts, but I only got 0's. With +s"*@s*" removed, I get good results, but they don't include the @s English words. I'm not sure how to fix this.<br>
<br>
Thanks!<br>
<br>
Lulu<br>
<br>
On Tuesday, June 28, 2016 at 10:22:40 PM UTC-4, Lulu wrote: <o:p></o:p></p>
<blockquote style="border:none;border-left:solid #CCCCCC 1.0pt;padding:0in 0in 0in 6.0pt;margin-left:4.8pt;margin-top:5.0pt;margin-right:0in;margin-bottom:5.0pt">
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">Dear Brian,<br>
<br>
That just did magic! Thank you so much!<br>
<br>
Best,<br>
Lulu<br>
<br>
On Tuesday, June 28, 2016 at 10:15:26 PM UTC-4, Brian MacWhinney wrote: <o:p></o:p></p>
<blockquote style="border:none;border-left:solid #CCCCCC 1.0pt;padding:0in 0in 0in 6.0pt;margin-left:4.8pt;margin-top:5.0pt;margin-right:0in;margin-bottom:5.0pt">
<div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span style="font-size:11.0pt;font-family:Calibri">Dear Lulu,</span><o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;text-indent:9.0pt">
<span style="font-size:11.0pt;font-family:Calibri">Without seeing your transcripts, I can’t say exactly what is wrong.  However, if you run this similar command on the CharlotteEng folder in the YipMatthews corpus, you get good results:</span><o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;text-indent:9.0pt">
<span style="font-size:11.0pt;font-family:Calibri"> </span><o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;text-indent:9.0pt">
<span style="font-size:11.0pt;font-family:Calibri">freq +t*CHI +s"[- yue]" *.cha</span><o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;text-indent:9.0pt">
<span style="font-size:11.0pt;font-family:Calibri"> </span><o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;text-indent:9.0pt">
<span style="font-size:11.0pt;font-family:Calibri">The idea is that this will include all words on the [- yue] lines, including those with @s, although the latter are pretty rare. If you want to exclude those, just add -s"*@s".</span><o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;text-indent:9.0pt">
<span style="font-size:11.0pt;font-family:Calibri"> </span><o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;text-indent:9.0pt">
<span style="font-size:11.0pt;font-family:Calibri">-- Brian MacWhinney</span><o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span style="font-size:11.0pt;font-family:Calibri"> </span><o:p></o:p></p>
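<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span style="font-size:11.0pt;font-family:Calibri">As a rough illustration of the selection described above, and not CLAN's actual implementation, the following Python sketch collects the words on one speaker's lines that carry a given language precode and optionally drops the @s-marked ones. The function name, its parameters, and the simplified one-line-per-utterance format are assumptions made for the example.</span><o:p></o:p></p>
<pre>
```python
import re


def words_on_marked_lines(transcript, speaker, language, keep_switched=True):
    """Collect words from one speaker's utterances that start with [- language]."""
    precode = "[- " + language + "]"
    words = []
    for line in transcript.splitlines():
        # Only main-tier lines of the requested speaker, e.g. '*TCH:'.
        if not line.startswith("*" + speaker + ":"):
            continue
        utterance = line.split(":", 1)[1].strip()
        if not utterance.startswith(precode):
            continue
        for token in utterance[len(precode):].split():
            # Skip bare punctuation such as the utterance terminator '.'.
            if not re.search(r"\w", token):
                continue
            # Optionally drop words carrying an @s language-switch marker.
            if re.search(r"@s([:$].*)?$", token) and not keep_switched:
                continue
            words.append(token)
    return words


sample = "*TCH:\tyou can take this heart .\n*TCH:\t[- zho] this@s$n 星 .\n"
print(words_on_marked_lines(sample, "TCH", "zho"))
print(words_on_marked_lines(sample, "TCH", "zho", keep_switched=False))
```
</pre>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span style="font-size:11.0pt;font-family:Calibri">The first call keeps both this@s$n and the Chinese word, which mirrors why the @s words show up in the counts; the second call keeps only the Chinese word.</span><o:p></o:p></p>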
<div style="border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><b><span style="font-family:Calibri;color:black">From:
</span></b><span style="font-family:Calibri;color:black">ChiBolts &lt;chib...@googlegroups.com&gt; on behalf of Lulu &lt;lulu...@gmail.com&gt;<br>
<b>Reply-To: </b>ChiBolts &lt;chib...@googlegroups.com&gt;<br>
<b>Date: </b>Tuesday, June 28, 2016 at 5:10 PM<br>
<b>To: </b>ChiBolts &lt;chib...@googlegroups.com&gt;<br>
<b>Subject: </b>Running FREQ for bilingual transcripts</span><o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
</div>
<div>
<div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">Hi Brian and team members,<br>
<br>
I ran the freq command<br>
<br>
freq +tTCH +s"[- zho]" *.cha<br>
<br>
for transcripts that contain bilingual utterances (e.g., *TCH:    [- zho] this@s$n
<span lang="ZH-CN" style="font-family:'MS Mincho'">星</span>). The dominant language of the transcripts was English, so we marked utterances that contained Chinese with [- zho]. The output types and tokens included all the English words that were marked @s. I
 thought I would get the types and tokens of all the Chinese words by running the above command. Is the problem with the transcript or the command?<br>
<br>
Thank you!<br>
<br>
Lulu<o:p></o:p></p>
</div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">--
<br>
You received this message because you are subscribed to the Google Groups "chibolts" group.<br>
To unsubscribe from this group and stop receiving emails from it, send an email to chibolts+u...@googlegroups.com.<br>
To post to this group, send email to chib...@googlegroups.com.<br>
To view this discussion on the web visit <a href="https://groups.google.com/d/msgid/chibolts/0e5d867c-79b1-4d36-87be-1303f390a83b%40googlegroups.com?utm_medium=email&utm_source=footer" target="_blank">
https://groups.google.com/d/msgid/chibolts/0e5d867c-79b1-4d36-87be-1303f390a83b%40googlegroups.com</a>.<br>
For more options, visit <a href="https://groups.google.com/d/optout" target="_blank">
https://groups.google.com/d/optout</a>.<o:p></o:p></p>
</div>
</div>
</div>
</div>
</blockquote>
</div>
</blockquote>
</div>
</div>
</div>
</blockquote>
</div>
</div>
</blockquote>
</div>
</div>
</div>
</blockquote>
</div>
</body>
</html>
