<html>
  <head>
    <meta content="text/html; charset=utf-8" http-equiv="Content-Type">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    <p>Alex,</p>
    <p>Here are two commands that will do what I think you want. They
      are not especially elegant, but then again nothing involving
      regular-expression search is. For an English-to-French switch, try
      this command:</p>
    <p>combo +b2 -l +s"\**:^*s:eng^*^\**:^*s:fra" *.cha</p>
    <p>And for a French-to-English switch, try this command:</p>
    <p>combo +b2 -l +s"\**:^*s:fra^*^\**:^*s:eng" *.cha</p>
    <p>If this does not work well for you and Gladys, then please email
      me directly a sample of your data file, so that I can see all the
      tags and how they are used in the file and suggest a more precise
      command. I understand that this feature is very valuable for
      studying bilingual data, so we might even try to add some new
      features to CLAN to do a better job of searching for language
      switching.<br>
    </p>
    <p><br>
    </p>
    <pre class="moz-signature" cols="72">
Leonid.

</pre>
    <div class="moz-cite-prefix">On 24-03-17 06:50, A Cristia wrote:<br>
    </div>
    <blockquote
      cite="mid:36b82d51-4b8b-457b-ae97-37a1a484a963@googlegroups.com"
      type="cite">
      <div dir="ltr">Dear Leonid,
        <div><br>
        </div>
        <div>Thank you for the fast response. What Gladys would like to
          extract are *pairs* of consecutive sentences, one spoken in one
          language, the other in another. Imagine a sequence like this:</div>
        <div>
          <ol>
            <li>English<br>
            </li>
            <li><b>English<br>
              </b></li>
            <li><b>French</b></li>
            <li>French<br>
            </li>
            <li><b>French<br>
              </b></li>
            <li><b>English</b><br>
            </li>
          </ol>
        </div>
        <div>Gladys would like to extract sentences 2-3 (switch
          Eng->Fr), and 5-6 (switch Fr->Eng).</div>
        <div><br>
        </div>
        <div>Of course, this can be approximated by using kwal,
          extracting the [- spa] sentences with some context, and then
          looking through by hand to see if the context is also in
          Spanish (so not a switch) or in Qom (yes, it's a switch, and
          thus part of what we would like to extract). I wonder if there
          is an elegant solution for this in CLAN already.</div>
        <div><br>
        </div>
        <div>If I were to do this in bash, I'd do something not very
          elegant like (imagining there is only the content of the
          transcription):</div>
        <div>sed -E '/\[- spa\]/!s/^/[- qom] /' | # add [- qom] to all lines
          NOT marked with [- spa]</div>
        <div>   tr '\n' '|' |   # next, replace the line breaks with a
          placeholder (assumed not to occur in the text)</div>
        <div>   sed 's/|\(\[- ...]\)/ \1|\1/g' |   # copy each language
          marker to the end of the preceding line<br>
          <div>   tr '|' '\n' |   # translate the placeholder back into
            line breaks</div>
          <div>   grep -A 1 -e '\[- qom].*\[- spa]' -e '\[- spa].*\[- qom]'  # and
            finally extract the lines that carry two different language
            markers, plus the line that follows each</div>
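          <div><br>
          </div>
          <div>The same pairing can also be sketched in Python, which
            avoids the placeholder trick. This is only an illustrative
            sketch, not a CLAN feature: the names language_of and
            find_switches are hypothetical, and it assumes that main-tier
            lines start with "*", that a "[- xxx]" precode marks the
            language of an utterance, and that unmarked utterances are in
            the default language (labelled "qom" here). The sample lines
            are the ones from the original question in this thread.</div>
<pre>
```python
import re

# Assumption: a "[- xxx]" precode marks the language of an utterance;
# utterances without one are in the default language.
PRECODE = re.compile(r"\[- (\w+)\]")


def language_of(line, default="qom"):
    """Return the language code of one main-tier line."""
    match = PRECODE.search(line)
    return match.group(1) if match else default


def find_switches(lines, default="qom"):
    """Return (index, previous, current, direction) for every pair of
    adjacent main-tier lines whose languages differ."""
    # Assumption: main-tier (speaker) lines start with "*".
    utterances = [ln for ln in lines if ln.startswith("*")]
    switches = []
    for i, (prev, cur) in enumerate(zip(utterances, utterances[1:])):
        a = language_of(prev, default)
        b = language_of(cur, default)
        if a != b:
            switches.append((i, prev, cur, a + " to " + b))
    return switches


# Sample utterances taken from the question in this thread.
sample = [
    "*FAC:\tʔaqaixana .",
    "*FAC:\tten qaica naxa qaicaʔ .",
    "*FAC:\t[- spa] vamos afuera .",
    "*FAC:\tñaq qaica ten  paʔatauec na .",
    "*FAC:\tñaq qaica ten .",
]

for i, prev, cur, direction in find_switches(sample):
    print(f"{direction}: utterances {i + 1} and {i + 2}")
```
</pre>
          <div>On the sample this reports "qom to spa: utterances 2 and
            3" and "spa to qom: utterances 3 and 4", that is, both
            directions of switch. Each returned tuple also carries the
            two lines themselves, so the pairs can be written out for
            inspection.</div>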
          <div><br>
          </div>
          <div>Does that make more sense? Thank you in advance,</div>
          <div><br>
          </div>
          <div>Alex</div>
          <div><br>
          </div>
          On Thursday, March 23, 2017 at 8:27:02 PM UTC+1, Spektor,
          Leonid: CMU wrote:
          <blockquote class="gmail_quote" style="margin: 0;margin-left:
            0.8ex;border-left: 1px #ccc solid;padding-left: 1ex;">
            <div bgcolor="#FFFFFF" text="#000000">
              <p>Alex,</p>
              <p>    I am not sure what you mean by "LANGUAGE
                SWITCH", but you can use the +s"[- spa]" option to analyze
                only utterances with the "[- spa]" code and the -s"[- spa]"
                option to analyze only utterances that do not have the "[-
                spa]" code. If this doesn't help, then please email me
                more example input data files and examples of the
                output that you want to get.<br>
              </p>
              <pre cols="72">Leonid.

</pre>
              <div>On 23-03-17 14:19, A Cristia wrote:<br>
              </div>
              <blockquote type="cite">
                <div dir="ltr">Dear clan users,<br>
                  <br>
                  In a bilingual corpus, is there a way to search for
                  pairs of sentences where a language switch has
                  occurred? A search for the tagged language will only
                  reveal switches from the major to the minor language,
                  but we'd like to extract both:<br>
                  <br>
                  *FAC:    ʔaqaixana . <br>
                  *FAC:    ten qaica naxa qaicaʔ . <br>
                  *FAC:    [- spa] vamos afuera . <---- LANGUAGE
                  SWITCH FROM THE PREVIOUS SENTENCE TO THIS SENTENCE
                  (major to minor -- can be found searching for [- spa])<br>
                  *FAC:    ñaq qaica ten  paʔatauec na . <----
                  LANGUAGE SWITCH FROM THE PREVIOUS SENTENCE TO THIS
                  SENTENCE (minor to major -- can it be found?)<br>
                  *FAC:    ñaq qaica ten . <br>
                  <br>
                  <br>
                  <br>
                  Thank you in advance,<br>
                  <br>
                  Gladys Ojea and Alex Cristia<br>
                  <br>
                  <br>
                </div>
                -- <br>
                You received this message because you are subscribed to
                the Google Groups "chibolts" group.<br>
                To unsubscribe from this group and stop receiving emails
                from it, send an email to <a moz-do-not-send="true"
                  href="javascript:" target="_blank"
                  gdf-obfuscated-mailto="uaejvwuHAAAJ" rel="nofollow"
                  onmousedown="this.href='javascript:';return true;"
                  onclick="this.href='javascript:';return true;">chibolts+u...@<wbr>googlegroups.com</a>.<br>
                To post to this group, send email to <a
                  moz-do-not-send="true" href="javascript:"
                  target="_blank" gdf-obfuscated-mailto="uaejvwuHAAAJ"
                  rel="nofollow"
                  onmousedown="this.href='javascript:';return true;"
                  onclick="this.href='javascript:';return true;">chib...@googlegroups.com</a>.<br>
                To view this discussion on the web visit <a
                  moz-do-not-send="true"
href="https://groups.google.com/d/msgid/chibolts/b465a75f-66da-4a69-86c1-35cd9bc50ea8%40googlegroups.com?utm_medium=email&utm_source=footer"
                  target="_blank" rel="nofollow"
onmousedown="this.href='https://groups.google.com/d/msgid/chibolts/b465a75f-66da-4a69-86c1-35cd9bc50ea8%40googlegroups.com?utm_medium\x3demail\x26utm_source\x3dfooter';return
                  true;"
onclick="this.href='https://groups.google.com/d/msgid/chibolts/b465a75f-66da-4a69-86c1-35cd9bc50ea8%40googlegroups.com?utm_medium\x3demail\x26utm_source\x3dfooter';return
                  true;">https://groups.google.com/d/<wbr>msgid/chibolts/b465a75f-66da-<wbr>4a69-86c1-35cd9bc50ea8%<wbr>40googlegroups.com</a>.<br>
                For more options, visit <a moz-do-not-send="true"
                  href="https://groups.google.com/d/optout"
                  target="_blank" rel="nofollow"
                  onmousedown="this.href='https://groups.google.com/d/optout';return
                  true;"
                  onclick="this.href='https://groups.google.com/d/optout';return
                  true;">https://groups.google.com/d/<wbr>optout</a>.<br>
              </blockquote>
              <br>
            </div>
          </blockquote>
        </div>
      </div>
    </blockquote>
    <br>
  </body>
</html>
