CLAN and Excel

bartsch bartsch at zas.gwz-berlin.de
Mon Mar 2 22:31:43 UTC 2009


Dear Leonid,

Thank you. Interlinear answers below.


On Mon, 02 Mar 2009 14:58:08 -0500, Leonid Spektor <spektor at andrew.cmu.edu>
wrote:
> Susanna,
> 
>     The search patterns that you use to find codes in your data
> overlap. For example, the pattern "%|S-%" will match every code that the
> pattern "%|S-PRO:PP-%" matches, plus more. The pattern "%|%-PRO:PP-%"
> will match every code that either "%|S-PRO:PP-%" or "%|O-PRO:PP-%"
> matches, plus some more. This means that you have some very general
> patterns that will override the more specific ones. To deal with this,
> you need to prioritize the order of the search patterns. The more
> general patterns, such as "%|S-%" and "%|O-%", need to be listed last,
> and the more specific patterns, such as "%|S-PRO:PP-%" and
> "%|O-PRO:PP-%", need to be listed first. This is a basic rule in CLAN:
> all processing is done from left to right and from top to bottom. The
> new sequence of search patterns is:
> 
> %|S-DA:T-%
> %|O-DA:T-%
> %|S-PRO:PP-%
> %|O-PRO:PP-%
> %|%-DA:T-%
> %|%-PRO:PP-%
> %|S-%
> %|O-%
> 
> I have changed the code.cut file to reflect this and I am attaching it to
> this message.
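
The ordering rule described above can be illustrated with a small Python sketch. It is only a model of "the first listed pattern that matches wins": it treats '%' as "any run of characters", which is an assumption and not a reimplementation of CLAN's matcher. The helper names and the sample code string are hypothetical.

```python
import re

def to_regex(pattern):
    # Assumption: '%' stands for any run of characters (possibly empty).
    parts = [re.escape(part) for part in pattern.split("%")]
    return re.compile(".*".join(parts))

def first_match(code, patterns):
    # Return the first pattern in list order that matches the whole code.
    for pattern in patterns:
        if to_regex(pattern).fullmatch(code):
            return pattern
    return None

# Order of the original command: general patterns listed first.
GENERAL_FIRST = ["%|S-%", "%|O-%", "%|%-DA:T-%", "%|%-PRO:PP-%",
                 "%|S-DA:T-%", "%|O-DA:T-%", "%|S-PRO:PP-%", "%|O-PRO:PP-%"]

# Reordered sequence from the message: specific patterns listed first.
SPECIFIC_FIRST = ["%|S-DA:T-%", "%|O-DA:T-%", "%|S-PRO:PP-%", "%|O-PRO:PP-%",
                  "%|%-DA:T-%", "%|%-PRO:PP-%", "%|S-%", "%|O-%"]

code = "$x|S-PRO:PP-y"  # hypothetical code, not taken from the real data
print(first_match(code, GENERAL_FIRST))
print(first_match(code, SPECIFIC_FIRST))
```

Under this model, the general-first order assigns the sample code to "%|S-%", so "%|S-PRO:PP-%" never gets a chance to match it. The specific-first order assigns it to "%|S-PRO:PP-%".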

I tried with the long command, as well as with the code.cut file - in both
cases, CLAN freezes. I tried it several times, I restarted the computer and
tried again and again - without success.


> 
> The last thing to remember is that CLAN will only list patterns that it
> can actually match to something in the data, so if some pattern doesn't
> match anything in the data set, then no column will be created for it
> in the output.

I see. For coding purposes, though, I think it would be useful to get a
column for patterns that match nothing in the data as well.
More important for the moment: as you can see from the Excel file I
sent you, in one of the missing columns (|-PRO:PP-) one child did have 2
tokens for the searched coding string (combo found them, and they are also
in the transcript). However, the column was not generated... What might
have happened?
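
One possible workaround for the missing columns, sketched here in Python as post-processing outside CLAN: count each code under the first pattern that matches it, then emit a column for every searched pattern, with 0 where nothing matched. This is not a CLAN feature. The '%' handling (any run of characters), the helper names, and the sample codes are all assumptions made for illustration.

```python
import re
from collections import Counter

def to_regex(pattern):
    # Assumption: '%' stands for any run of characters (possibly empty).
    parts = [re.escape(part) for part in pattern.split("%")]
    return re.compile(".*".join(parts))

def tally(codes, patterns):
    # Count each code under the first pattern (in list order) that matches it.
    compiled = [(pattern, to_regex(pattern)) for pattern in patterns]
    counts = Counter()
    for code in codes:
        for pattern, regex in compiled:
            if regex.fullmatch(code):
                counts[pattern] += 1
                break
    # Every searched pattern gets a column, including those with no matches.
    return {pattern: counts.get(pattern, 0) for pattern in patterns}

PATTERNS = ["%|S-PRO:PP-%", "%|O-PRO:PP-%", "%|S-%", "%|O-%"]
CODES = ["$a|S-PRO:PP-b", "$a|S-x-b", "$a|S-PRO:PP-c"]  # hypothetical codes

row = tally(CODES, PATTERNS)
for pattern, count in row.items():
    print(pattern, count)
```

A row built this way keeps the same columns for every child, which makes the result easier to line up in Excel.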

Kind regards,
Susanna

> On 02-03-09 08:06, "bartsch" <bartsch at zas.gwz-berlin.de> wrote:
> 
>> 
>> Dear Leonid,
>> 
>> thank you again for your hints. Some interlinear answers below.
>> 
>> 
>>> 
>>> Susanna,
>>> 
>>>     Try this command:
>>> 
>>> freq +d2 +t@ID="*target_child*" -t* +t%cod +s"%|S-%" +s"%|O-%"
>>> +s"%|%-DA:T-%" +s"%|%-PRO:PP-%" +s"%|S-DA:T-%" +s"%|O-DA:T-%"
>>> +s"%|S-PRO:PP-%" +s"%|O-PRO:PP-%"
>>> 
>>> Notice that I have replaced all the '*' characters with the '%'
>>> character. The above example should be on one command line. If this is
>>> too much, then you can use the file "codes.cut" that I am attaching to
>>> this email with this command:
>>> 
>>> freq +d2 +t@ID="*target_child*" -t* +t%cod +s@codes.cut
>>> 
>>> If this doesn't help you, then please send me a sample of your data
>>> file and a further description of what exactly didn't work for you.
>> 
>> Well, some things worked, others didn't. Whether I used the full long
>> command or the codes.cut file, the output was the same, with the
>> following 2 problems:
>> 
>> 1. There were no columns for two of the searched coding strings:
>> "%|%-PRO:PP-%" and "%|O-PRO:PP-%"
>> 
>> 2. Frequencies in half of the columns differed from those I got when
>> checking them with a combo command.
>> 
>> I'll send a sample of my data directly to you, along with a message
>> with more details.
>> 
>> 
>>>     In my tests I did not see an extra column between 'situation' and
>>> the first word of the concordance. Perhaps the @ID header tiers in your
>>> data files have an extra element at the end. You can see the correct
>>> output by looking at "sample.cha" and running these commands on it:
>>> 
>>> freq +d2 +s"pro:%|%" +s"pro|%" sample.cha +t@ID="*mother*" -t* +t%mor
>>> statfreq stat.out.cex +f +d
>>> 
>>> The "sample.cha" file is located in the clan/lib/sample/ folder.
>> 
>> 
>> Yes, curiously, the extra column no longer appeared, although I didn't
>> make any changes to the @ID header tiers...
>> 
>> Thank you in advance for further help.
>> 
>> Kindest regards,
>> Susanna

*****************************************************************
Susanna Bartsch
bartsch at zas.gwz-berlin.de
http://www.zas.gwz-berlin.de/mitarb/homepage/bartsch
Zentrum fuer Allgemeine Sprachwissenschaft (ZAS)
Centre for General Linguistics
Schuetzenstr. 18
10117 Berlin
Germany
Tel. +49 (0)30 20 192 503
Fax  +49 (0)30 20 192 402 
*****************************************************************



