combo using clause delimiters

Brian MacWhinney macw at cmu.edu
Wed Apr 23 17:37:30 UTC 2014


Bruno,
    Yes, this will work in the sense that MLU will compute by combining the clauses into utterances. However, I note that you are using LIDES coding conventions that we stopped encouraging about 12 years ago.  It would be better if you could use current CHAT conventions.

--Brian 

On Apr 23, 2014, at 8:25 AM, Bruno Estigarribia <brunilda at gmail.com> wrote:

> (Sorry, sent unfinished by mistake)
> Thank you Leonid. Just to clarify: transforming this
> *RAM:    Che~niko@1 che#felí(z)@4 con@2 mi@2 concubin-o@2 akue@1 , [^c]
>     ha(s)ta@2 que@2 un@2a@2 fatale@2 a#menda@1 hese@1 por@2
>     liga-ite@4 , [^c] nunca@2 má@2 a#de(s)cansá@4 , [^c] ha(s)ta@2
>     que@2 a#heja@1 ichupe@1 [^c] .
> 
> into this
> 
> *RAM:    Che~niko@1 che#felí(z)@4 con@2 mi@2 concubin-o@2 akue@1 , [^c] +.
> *RAM:    ha(s)ta@2 que@2 un@2a@2 fatale@2 a#menda@1 hese@1 por@2
>     liga-ite@4 , [^c] +.
> *RAM:    nunca@2 má@2 a#de(s)cansá@4 , [^c] +.
> *RAM:    ha(s)ta@2
>     que@2 a#heja@1 ichupe@1 [^c] .
> 
> would work, right? Now each tier contains only one clause. But this causes other problems, in that now you cannot do measures on utterances anymore, correct? There is no way for any program to see a transcription break +. and recognize that that tier's content is in the same utterance as something that follows...
> Thanks
> Bruno
> 
> On Tuesday, April 22, 2014 5:21:00 PM UTC-4, Bruno Estigarribia wrote:
> Hello everyone,
> 
> I have a code-switching transcript where we used [^c] as a clause delimiter when a line (=utterance) consisted of more than one clause.
> We have also used @1 and @2 as word markers for each one of the two languages. And we have used @4 to mark mixed words. An example line follows (please ignore the morphological markings on the main tier for the moment--I've discussed this in a different thread and we intend to replace them with a proper MOR tier):
> 
> *RAM:    Che~niko@1 che#felí(z)@4 con@2 mi@2 concubin-o@2 akue@1 , [^c]
>     ha(s)ta@2 que@2 un@2a@2 fatale@2 a#menda@1 hese@1 por@2
>     liga-ite@4 , [^c] nunca@2 má@2 a#de(s)cansá@4 , [^c] ha(s)ta@2
>     que@2 a#heja@1 ichupe@1 [^c] .
> 
> I want to find and count all mixed CLAUSES (intraclausal switching, excluding interclausal switching). The best I could come up with was this command:
> combo +r5 +t* +s(*\@1^*^![\^c]^*^*\@2)+(*\@2^*^![\^c]^*^*\@1)+(*\@4) +f
> 
> This retrieves and outputs every line with any sort of mix, so for example the line above would be output once. We want to output each matched CLAUSE instead, so the line above would actually give 4 output matches, since all 4 clauses have some kind of mixing. Note that this is not the same as outputting each match, since we collapse all matches obtained within a single clause (see the first clause in the example above).
> I know that MLU has the +C option to work on clauses rather than utterances, but it is limited to MLU.
> I assume I can transform all clauses into unique lines by using the transcription break terminator +. and use COMBO the normal way. But is there another (perhaps more elegant) solution?
> Thank you
> Bruno
> 
> -- 
> You received this message because you are subscribed to the Google Groups "chibolts" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to chibolts+unsubscribe at googlegroups.com.
> To post to this group, send email to chibolts at googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/chibolts/161bbc60-d2c9-41ed-a9b7-67cd838d2550%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
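
For readers who want a per-clause count without splitting utterances across tiers, the counting step described in the original question can be sketched outside CLAN. The following is only an illustration in Python, not a CLAN feature and not something proposed in the thread. It assumes that [^c] closes each clause, that words carry the markers @1, @2 and @4 as described above (with "@" restored from the archive's " at " obfuscation), and that a clause counts as mixed when it contains an @4 word or both an @1 and an @2 word. The function names are hypothetical.

```python
import re

# Assumption: "[^c]" ends each clause, as in the transcript described above.
CLAUSE_DELIM = "[^c]"
# Assumption: @1 and @2 mark the two languages, @4 marks a mixed word.
MARKER = re.compile(r"@([124])")


def split_clauses(utterance):
    """Split one main-tier utterance into clauses at each [^c] delimiter."""
    parts = utterance.split(CLAUSE_DELIM)
    # Drop pieces that are empty or hold only the utterance terminator.
    return [p.strip() for p in parts if p.strip(" .?!\t\n")]


def is_mixed(clause):
    """True if the clause has an @4 word, or both an @1 and an @2 word."""
    markers = set(MARKER.findall(clause))
    return "4" in markers or {"1", "2"} <= markers


def count_mixed_clauses(utterance):
    """Count the clauses of one utterance that show intraclausal mixing."""
    return sum(is_mixed(c) for c in split_clauses(utterance))


# The example utterance from the thread, speaker code removed.
EXAMPLE = (
    "Che~niko@1 che#felí(z)@4 con@2 mi@2 concubin-o@2 akue@1 , [^c] "
    "ha(s)ta@2 que@2 un@2a@2 fatale@2 a#menda@1 hese@1 por@2 "
    "liga-ite@4 , [^c] nunca@2 má@2 a#de(s)cansá@4 , [^c] ha(s)ta@2 "
    "que@2 a#heja@1 ichupe@1 [^c] ."
)

if __name__ == "__main__":
    print(count_mixed_clauses(EXAMPLE))  # prints 4
```

Because the utterance stays on a single tier, utterance-level measures remain available, and several matches inside one clause are collapsed into a single count for that clause.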
