scoping of a clause delimiter

'Monika Bader' via chibolts chibolts at googlegroups.com
Thu May 9 12:11:13 UTC 2019


Dear Brian, 
thanks for responding so quickly! We are working with written texts 
produced by young foreign language learners and want to code both for 
clauses and T-units/C-units to get ratios such as clauses per T-unit (a 
measure for which scoping is not crucial). We are relatively new to CLAN 
and had a tutorial with Victoria Johansson from Lund University, where they 
used CLAN for a big project on language development. They used the 
following strategy to encode T-units and clauses: clauses were placed on 
separate chat lines, while T-units were separated using @EndTurn. 
Center-embedded clauses (which there are many of in Swedish) were repeated 
on a separate subordinate tier, named %ces. 
After reading the CHAT/CLAN manuals, we realized that an alternative option 
would be to place T-units on separate chat lines, and use [^c] to code for 
clauses. We have since been grappling with the possible consequences of 
choosing one strategy over the other, so if you have some thoughts around 
this, we would be happy to get your advice. We want later to share the 
corpus with our students for further analyses (and further coding), so we 
are trying to think carefully about the different options. So far we are 
leaning towards the [^c] strategy. Though T-units have been used in the 
literature to measure development, we are also interested in conducting a 
more detailed analysis, and investigating what kind of clauses and 
structures are inside those T-units to better understand what our learners 
can do and which clauses and structures they rely on. We are also 
interested in investigating the extent to which they use what LGSWE calls 
syntactic nonclausal structures (which seem to be quite common in our data 
set). We have the impression that a lot of this information can be encoded 
by modifying [^c]. It seems to us that there might be issues related to 
scoping if one then wanted to limit the search to features related to 
subtypes of [^c] (for instance, the number of errors, or of certain types of 
errors, in certain types of clauses, such as relative clauses). It is possible 
that, as you suggest, these are better handled by relying on the %gra line, though 
I think we would have to conduct an extensive manual error analysis if we 
wanted to get a relatively accurate %gra line. 
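
To illustrate the [^c] strategy described above, here is a minimal Python sketch (not a CLAN command) that counts clauses per T-unit when each T-unit sits on its own line and clauses are marked with [^c] or a user-defined variant. The sample sentences and the label "rel" are invented for illustration, the function name is hypothetical, and the sketch assumes speaker IDs and dependent tiers have already been stripped.

```python
import re
from collections import Counter

# Matches a clause delimiter such as [^c] or [^c err]; the label is optional.
CLAUSE_CODE = re.compile(r"\[\^c(?:\s+([^\]]+))?\]")

def clause_stats(t_units):
    """Return (clauses per T-unit, Counter of clause labels).

    Each string in t_units is assumed to hold exactly one T-unit.
    Unlabelled [^c] codes are tallied under the key "plain".
    """
    labels = Counter()
    total = 0
    for line in t_units:
        for match in CLAUSE_CODE.finditer(line):
            labels[match.group(1) or "plain"] += 1
            total += 1
    ratio = total / len(t_units) if t_units else 0.0
    return ratio, labels

# Invented sample data: two T-units, three clause codes in total.
sample = [
    "I like the dog [^c] that lives next door [^c rel] .",
    "she goed home [^c err] .",
]
ratio, labels = clause_stats(sample)
print(ratio, dict(labels))
```

With this sample the ratio is 1.5 clauses per T-unit. A ratio of this kind only counts codes, so it does not depend on how far each code scopes.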

Best,
Monika

On Wednesday, 8 May 2019 at 16:01:54 UTC+2, macw wrote:
>
> Dear Monika,
>     I am not familiar with work that calculates MLU based on clauses and I 
> am not sure why one would want to use such a measure.  The major point of 
> MLU is to consider the extent to which speakers compose more complex 
> sentences and the act of breaking up sentences into clauses would actually 
> remove the thing that it is trying to measure.  
>    As you say, this system of clause marking is definitely not going to 
> work well for center embedding.  You could get the scope of the embedded 
> clause, but then the main clause would be broken up.  But perhaps that is 
> interesting in itself.  
>     I am curious why you are using this type of analysis.  What exactly 
> are you interested in measuring?  It seems to me that, if you have a 
> relatively accurate %gra line for an utterance, then that could be more 
> useful than hand-done clause marking.
>
> --Brian MacWhinney
>
> On May 8, 2019, at 7:48 AM, 'Monika Bader' via chibolts 
> <chib... at googlegroups.com> wrote:
>
> Hi,
> we are trying to decide on the best way to code for clauses. The manual 
> suggests using a clause delimiter, and we quite like this option 
> (especially the possibility of creating user defined codes). However, we 
> are somewhat worried about the scoping of the symbol. We understand that 
> for some analyses, such as MLU/MLT based on clauses, this is not a crucial 
> issue, but we do believe that for some other analyses one would need the 
> right scoping (if we are not mistaken), for instance when calculating words 
> per error-free clause (or per any other clause code one uses). In examples such as 
>
> The book [that you buyed yesterday] [^c err] has disappeared [^c] 
>
> [^c err] would scope over "the book" as well, which we wouldn't 
> necessarily want to include. In some languages these kinds of 
> nested/center-embedded clauses are more common than in others. The manual 
> says that "it is not necessary to mark the scope", but is it possible? Or 
> is there any other way to deal with cases such as these?
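
The scoping concern in the example above can be made concrete with a small Python sketch. It assumes a purely left-scoped reading in which each [^c ...] code covers every word since the previous code. This is an illustrative assumption about how a word count might be attributed, not a description of what any CLAN program does, and the function name is hypothetical. The square brackets that mark the intended clause in the example are left out of the input.

```python
import re

# Matches a clause delimiter such as [^c] or [^c err]; the label is optional.
CLAUSE_CODE = re.compile(r"\[\^c(?:\s+([^\]]+))?\]")

def left_scoped_clauses(utterance):
    """Assign to each clause code all words since the previous code.

    Returns a list of (label, words) pairs; an unlabelled [^c] gets "".
    """
    clauses = []
    pos = 0
    for match in CLAUSE_CODE.finditer(utterance):
        words = utterance[pos:match.start()].split()
        clauses.append((match.group(1) or "", words))
        pos = match.end()
    return clauses

example = "The book that you buyed yesterday [^c err] has disappeared [^c]"
for label, words in left_scoped_clauses(example):
    print(label or "(plain)", len(words), words)
```

Under this reading the clause coded err is credited with six words, including "The book", although the relative clause itself has only four, and the main clause is left with just "has disappeared".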
>
> We appreciate any suggestions!
>
> Best,
> Monika
>

-- 
You received this message because you are subscribed to the Google Groups "chibolts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to chibolts+unsubscribe at googlegroups.com.
To post to this group, send email to chibolts at googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/chibolts/c3090979-aa74-4ec2-a168-5b39729f69cd%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.