Working with codes

Brian MacWhinney macw at cmu.edu
Fri Jun 20 21:58:35 UTC 2008


Dear Susanna,

Sorry about the delay in replying.  I have been traveling.  Let me
try to answer some of these questions below.

--Brian MacWhinney

On Jun 19, 2008, at 7:43 AM, bartsch at zas.gwz-berlin.de wrote:

>
> Dear all,
>
> For a study on anaphora, we are coding referring expressions in
> children's narratives and I have some questions concerning the coding
> line (we use %cod), as well as subsequent CLAN analyses.
>
> First, is such a %cod line legal?
> *CHI:	der hund beisst sie in den schwanz.
> %cod:	der hund|S-DA:N-BL-V1:3-AS-DIR-hundC sie|O-PRO:PP-BL-V3:1-AS-DIS-katzeC den schwanz|NSO-DA:N-UBL-NV-IND-schwanzC
>

For dependent tier lines like %cod, pretty much everything is legal,
since the programs don't presume any particular structure on this
line.  For these lines, the main issue is a practical one: composing
the +s switch when you need to search.  Just make sure that you can
find the things you want to find by testing out some FREQ or KWAL
commands in advance.
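
One way to check in advance that a coding scheme is searchable is to
take a %cod line apart outside CLAN.  The following is only a minimal
Python sketch and not part of CLAN.  The function name parse_cod is
hypothetical, and it assumes that every coding string has the form
expression|code and that the code part contains no spaces, as in the
example above.

```python
import re

def parse_cod(line):
    """Split a %cod line into (expression, top-level fields) pairs.

    Assumes each coding string looks like 'expression|code', where the
    expression may contain spaces and the code may not.
    """
    # Drop the tier label if the caller passed the whole line.
    body = line.split(":", 1)[1] if line.startswith("%cod:") else line
    pairs = []
    for expr, code in re.findall(r"([^|]+)\|(\S+)", body):
        # '-' separates the top-level fields; ':' sub-levels stay intact.
        pairs.append((expr.strip(), code.split("-")))
    return pairs

EXAMPLE = ("%cod:\tder hund|S-DA:N-BL-V1:3-AS-DIR-hundC "
           "sie|O-PRO:PP-BL-V3:1-AS-DIS-katzeC "
           "den schwanz|NSO-DA:N-UBL-NV-IND-schwanzC")

if __name__ == "__main__":
    for expression, fields in parse_cod(EXAMPLE):
        print(expression, len(fields), fields)
```

Printing the number of fields per coding string makes it easy to spot
strings whose field count differs from the others before writing any
position-based search.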

> We use the minus symbol - for separating 7 levels of coding of each
> referring expression, e.g., syntactic position, lexical realisation,
> in/animacy, referent introduction vs. reference maintenance, etc. The
> symbol : is used for separating sub-levels within each of the 7
> superordinated levels.

This is fine.  You will have to use search strings like
+s"*-*-*-*-BL-*" and such.  Personally, I would find this confusing
and prone to error, but if you are good at asterisk counting, this
will work.
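
To avoid counting asterisks by hand, the pattern can be generated from
a field position.  This is a small Python sketch under assumptions:
the helper names pattern_for and field_equals are hypothetical, the
positions are 1-based, and the generated string is only meant to be
pasted into a +s switch and then tested, as suggested above.

```python
def pattern_for(value, position, total):
    """Build a wildcard pattern with `value` in field `position` of `total`."""
    if not 1 <= position <= total:
        raise ValueError("position must be between 1 and total")
    fields = ["*"] * total
    fields[position - 1] = value
    return "-".join(fields)

def field_equals(code, position, value):
    """Strict check outside CLAN: is field `position` exactly `value`?"""
    fields = code.split("-")
    return 1 <= position <= len(fields) and fields[position - 1] == value

if __name__ == "__main__":
    # Reproduces the search string quoted above.
    print(pattern_for("BL", 5, 6))
    # Checks the third field of one coding string from the example.
    print(field_equals("S-DA:N-BL-V1:3-AS-DIR-hundC", 3, "BL"))
```

The strict check is useful as a cross-check on the CLAN output, since
it compares whole fields rather than relying on wildcard matching.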
>
>
> Secondly, is there any possibility to link each referring expression
> on the *CHI line with its coding on the %cod line? Provisionally, we
> opted for typing the referring expression before the coding string,
> e.g., 'die katze'.

Ah, herein lies the rub (somewhere in Shakespeare).  You are basically  
trying to construct something like the %mor line with its 1-to-1 match  
to the main line.  This is a great idea.  However, the CLAN software  
is not yet really ready for this.  We are currently right in the  
middle of implementing strict 1-to-1 matching between the %mor and the  
main tier within the XML version of CLAN.  Once this is finished then  
"match" searches will work with the %mor line.  At that point, it  
would be relatively easy to extend this to a tier called %mat for a  
user-defined matching tier.  However, none of this will be ready until  
later this year.
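
Until then, the expressions typed before each coding string can be
located on the main line outside CLAN.  The sketch below is a
workaround in Python and not a CLAN feature.  The function name
link_expressions is hypothetical, and it assumes that the expressions
appear on the main line in the same order and spelling as on the %cod
line.

```python
def link_expressions(main_line, expressions):
    """Return (expression, index of its first word) for each expression.

    The index is None when the expression cannot be found after the
    previous match.
    """
    # Drop the speaker label (e.g. '*CHI:') if it is present.
    if main_line.startswith("*"):
        main_line = main_line.split(":", 1)[1]
    tokens = [t.strip(".,?!") for t in main_line.split()]
    links, cursor = [], 0
    for expr in expressions:
        words = expr.split()
        start = None
        # Search left to right, starting after the previous match.
        for i in range(cursor, len(tokens) - len(words) + 1):
            if tokens[i:i + len(words)] == words:
                start = i
                cursor = i + len(words)
                break
        links.append((expr, start))
    return links

if __name__ == "__main__":
    main = "*CHI:\tder hund beisst sie in den schwanz."
    print(link_expressions(main, ["der hund", "sie", "den schwanz"]))
```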
>
>
> Thirdly and most importantly, we want to conduct analyses concerning
> the cooccurrence of elements within each coding string. For instance,
> we want to investigate differences in children's realisation of
> referents as a function of referent introduction vs. anaphorical
> expression (reference maintenance). For that, we want to find a range
> of cooccurrences as the following:
>
> DA:N and NV and IND
> (where DA:N means definite article + noun, NV means referent
> introduction, and IND means indirect anaphor)

I am not sure what you mean by "range" in your phrase "a range of  
cooccurrences".  However, finding *-DA:N-*^*^*-NV-* should be possible.

> I have tried COMBO, but either I don't understand the principle for the
> syntax of the command line or I miss some important switch or, well, I
> don't know what.

You probably just have to play around to learn how to use COMBO.

>
>
> Two things are very important for us in such searches:
> - The search must be limited to each of the coding strings and not be
> based on the whole %cod line. For instance, when looking for the
> cooccurrence of DA:N and DIS, CLAN should not find it in
> the example above, since it doesn't occur in any of the 3 coding
> strings. That is, for this concrete example, how can we
> ensure that CLAN ignores the cooccurrence of DA:N for 'der hund' and
> DIS for 'sie'?

That should be easy enough.  In COMBO lines, it is the ^ that searches  
across word boundaries.  Just make sure that your search strings don't  
include the ^.  So, you want
*-DA:N-*-DIS-*
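
The same restriction can be cross-checked outside CLAN by testing each
coding string separately.  This is a minimal Python sketch, not a
replacement for COMBO.  The function name hits is hypothetical, and it
assumes that each label (such as DA:N or DIS) fills a whole
hyphen-separated field of the code.

```python
import re

def hits(line, *labels):
    """List the expressions whose own coding string contains all labels."""
    # Drop the tier label if the caller passed the whole line.
    body = line.partition(":")[2] if line.startswith("%cod:") else line
    found = []
    for expr, code in re.findall(r"([^|]+)\|(\S+)", body):
        fields = set(code.split("-"))
        # All labels must occur within this one coding string.
        if all(label in fields for label in labels):
            found.append(expr.strip())
    return found

EXAMPLE = ("%cod:\tder hund|S-DA:N-BL-V1:3-AS-DIR-hundC "
           "sie|O-PRO:PP-BL-V3:1-AS-DIS-katzeC "
           "den schwanz|NSO-DA:N-UBL-NV-IND-schwanzC")

if __name__ == "__main__":
    print(hits(EXAMPLE, "DA:N", "DIS"))
    print(hits(EXAMPLE, "DA:N", "NV", "IND"))
```

For the example line, DA:N and DIS occur only in different coding
strings, so the first call finds nothing, while the second call finds
the coding string for 'den schwanz'.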

>
> - How can we proceed to get quantitative results of such searches? I
> mean, in addition to the concrete hits shown in the output window,
> it'd be very important to have the number of cooccurrences found in
> each chat file, as well as in all chat files in which the cooccurrence
> was looked for.
>
> I apologize if the answers to my questions are obvious or easy to
> find in the CLAN manual. I have read the manual very carefully
> before sending this query, but I don't seem to be able to find the
> needed answers therein.

I don't think you can really learn this stuff by reading the manual.
You just have to devote an hour or two to playing around with COMBO.
Think of it as a Bach theme with variations.
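
For the counts per file and over all files asked about above, one
option is to tally the hits outside CLAN and compare them with the
COMBO output.  The following Python sketch is only an illustration.
The function names are hypothetical, and it assumes that files are
UTF-8 text, that %cod tiers start with "%cod:", and that continuation
lines of a tier start with a tab.

```python
import re
from collections import Counter

def cod_tiers(text):
    """Yield the body of each %cod tier, joining tab-indented continuations."""
    current = None
    for raw in text.splitlines():
        if raw.startswith("\t") and current is not None:
            current += " " + raw.strip()
            continue
        if current is not None:
            yield current
        current = raw[len("%cod:"):] if raw.startswith("%cod:") else None
    if current is not None:
        yield current

def count_hits(text, labels):
    """Count coding strings that contain every label as a whole field."""
    wanted = set(labels)
    total = 0
    for tier in cod_tiers(text):
        for _expr, code in re.findall(r"([^|]+)\|(\S+)", tier):
            if wanted <= set(code.split("-")):
                total += 1
    return total

def count_by_file(paths, labels):
    """Return (per-file counts, overall total) for the given files."""
    counts = Counter()
    for path in paths:
        with open(path, encoding="utf-8") as handle:
            counts[str(path)] = count_hits(handle.read(), labels)
    return counts, sum(counts.values())
```

A call such as count_by_file(["a.cha", "b.cha"], ["DA:N", "NV", "IND"])
would then give one count per file plus the total, where the file
names are placeholders.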

--Brian MacWhinney

>
>
> Many, many thanks in advance for any hint.
>
> Kind regards,
> Susanna
>
> *****************************************************************
> Susanna Bartsch
> https://www.zas.gwz-berlin.de/mitarb/homepage/bartsch/
> bartsch at zas.gwz-berlin.de
> Zentrum fuer Allgemeine Sprachwissenschaft (ZAS)
> Centre for General Linguistics
> Schuetzenstr. 18
> 10117 Berlin
> Germany
> Tel. +49 (0)30 20 192 503
> Fax  +49 (0)30 20 192 402
> *****************************************************************
>
> >
>




