combo using clause delimiters

Bruno Estigarribia brunilda at gmail.com
Tue Apr 22 21:21:00 UTC 2014


Hello everyone,

I have a code-switching transcript where we used [^c] as a clause delimiter 
when a line (=utterance) consisted of more than one clause.
We have also used @1 and @2 as word markers for each one of the two 
languages. And we have used @4 to mark mixed words. An example line follows 
(please ignore the morphological markings on the main tier for the 
moment--I've discussed this in a different thread and we intend to replace 
them with a proper MOR tier):

*RAM:    Che~niko@1 che#felí(z)@4 con@2 mi@2 concubin-o@2 akue@1 , [^c]
    ha(s)ta@2 que@2 un@2a@2 fatale@2 a#menda@1 hese@1 por@2
    liga-ite@4 , [^c] nunca@2 má@2 a#de(s)cansá@4 , [^c] ha(s)ta@2 
    que@2 a#heja@1 ichupe@1 [^c] .

I want to find and count all mixed CLAUSES (intraclausal switching, 
excluding interclausal switching). The best I could come up with was this 
command:
combo +r5 +t* +s(*\@1^*^![\^c]^*^*\@2)+(*\@2^*^![\^c]^*^*\@1)+(*\@4) +f

This retrieves every line that contains any kind of mixing, so the line 
above is output only once. We want to output each matching CLAUSE instead, 
so the line above should yield 4 matches, since all 4 clauses contain some 
kind of mixing. (Note that this is not the same as outputting each match, 
since all matches found within a single clause are collapsed into one; see 
the first clause in the example above.)
I know that MLU has the +C option to work on clauses rather than 
utterances, but it is limited to MLU.
I assume I can transform all clauses into unique lines by using the 
transcription break terminator +. and use COMBO the normal way. But is 
there another (perhaps more elegant) solution?
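
To make the intended count concrete, here is a minimal sketch outside CLAN, 
in Python. It is not a COMBO solution, only an illustration of the logic: it 
assumes clauses are separated by [^c] and that words carry the @1, @2 and @4 
markers described above. The function name mixed_clauses is made up for this 
example.

```python
import re

# Assumptions: [^c] separates clauses; @1 and @2 mark the two languages,
# and @4 marks a mixed word.
CLAUSE_DELIM = "[^c]"
MARKER = re.compile(r"@([124])\b")

def mixed_clauses(utterance):
    """Return the clauses of one utterance that show intraclausal switching."""
    mixed = []
    for clause in utterance.split(CLAUSE_DELIM):
        langs = set(MARKER.findall(clause))
        # A clause is mixed if it has a mixed word (@4)
        # or contains words from both languages (@1 and @2).
        if "4" in langs or {"1", "2"} <= langs:
            mixed.append(" ".join(clause.split()))
    return mixed

# Last two clauses of the example line, plus the utterance terminator.
line = "nunca@2 má@2 a#de(s)cansá@4 , [^c] ha(s)ta@2 que@2 a#heja@1 ichupe@1 [^c] ."
print(len(mixed_clauses(line)))  # prints 2
```

Each clause is counted at most once however many switches it contains, and a 
switch that happens only across a [^c] boundary is not counted.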
Thank you
Bruno
