CHAINS tune up

Leonid Spektor spektor at andrew.cmu.edu
Thu Dec 6 23:38:07 UTC 2007


Nikolai,

     I am going to address your points one at a time:

1) Because most fonts are now of variable width, it is no longer possible
to format CLAN output so that it aligns properly. So we opted to separate
each field with a tab instead. This might still not make the output look
better in CLAN, but it is a perfect way to copy and paste it into Microsoft
Excel. This, we were told by many people, is the preferred method of looking
at statistical output. The +w option that the older version of CLAN has is
just a leftover that we had not removed at the time. The latest version of
CLAN no longer has the +w option.
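
As a rough illustration of why tabs help (this is not part of CLAN), a
tab-separated table can be split into fields by any script. The Python sketch
below assumes the output has been saved as text with exactly one tab between
fields; the sample is the chains.cha table shown under point 2, re-typed by
hand, and the function name is made up for the example. Since the header says
1=*MOT, 2=*CHI, a non-zero cell is treated here as "some speaker used this code
on this line", so the sketch counts non-zero cells per code instead of adding
up the cell values.

```python
import csv
import io

# Hypothetical sample: the chains.cha table from this message, re-typed with
# one tab between fields (an assumption about the real delimiter layout).
SAMPLE = (
    "$nia:fp\t$nia:gi\t$nia:pa\t$npp:pa\t$npp:yq\tline #\n"
    "0\t1\t0\t0\t0\t6\n"
    "2\t0\t0\t0\t2\t9\n"
    "0\t0\t0\t1\t0\t13\n"
    "0\t0\t1\t0\t0\t16\n"
)


def count_lines_per_code(text):
    """Count, for each code column, the rows whose cell is non-zero."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    header = next(reader)
    codes = header[:-1]  # the last column holds the line number
    counts = dict.fromkeys(codes, 0)
    for row in reader:
        if not row:
            continue  # skip blank lines
        # zip stops at the shorter list, so the line-number cell is ignored
        for code, cell in zip(codes, row):
            if cell.strip() not in ("", "0"):
                counts[code] += 1
    return counts


counts = count_lines_per_code(SAMPLE)
for code, n in counts.items():
    print(code, n, sep="\t")
```

The same per-code tally can be obtained in Excel after pasting the output,
which is the route described above; the script is only an alternative for
large files.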

2) There might be a misunderstanding as to what the +d1 option should do. I am
going to use a file called "chains.cha" as an example. This file is included
with every CLAN distribution and can be found in the "lib/samples" folder inside
the CLAN folder. This is what the CHAINS program does now:

> chains +t%spa chains.cha
chains +t%spa chains.cha
Thu Dec  6 18:25:19 2007
chains (06-Dec-2007) is conducting analyses on:
   ALL speaker tiers
     and those speakers' ONLY dependent tiers matching: %SPA;
****************************************
>>From file <chains.cha>

Speaker markers:  1=*MOT, 2=*CHI

$nia:fp    $nia:gi    $nia:pa    $npp:pa    $npp:yq    line #
0    1    0    0    0        6
2    0    0    0    2        9
0    0    0    1    0       13
0    0    1    0    0       16


> chains +t%spa chains.cha +d1
chains +t%spa chains.cha +d1
Thu Dec  6 18:25:04 2007
chains (06-Dec-2007) is conducting analyses on:
   ALL speaker tiers
     and those speakers' ONLY dependent tiers matching: %SPA;
****************************************
>>From file <chains.cha>

Speaker markers:  1=*MOT, 2=*CHI

$nia:fp    $nia:gi    $nia:pa    $npp:pa    $npp:yq    line #
                         1
                         2
                         3
                         4
                         5
      1                       6
                         7
                         8
2                   2        9
                        10
                        11
                        12
                1            13
                        14
                        15
           1                 16


The output is abbreviated and, as you can clearly see, looks badly
misaligned because of the variable-width font used in this email. But all the
line numbers found in the file are listed when the +d1 option is used.
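
To make the difference between the two outputs concrete, here is a small
Python sketch. It is only a model of what the two listings above show, not how
CLAN is implemented: the function name and the dictionary layout are invented
for the example, and it assumes that with +d1 a zero cell is left blank, which
is how the listing above appears to read.

```python
def expand_to_every_line(sparse_rows, last_line, n_codes):
    """Return one row per input line, blank where no code occurred."""
    expanded = []
    for line_no in range(1, last_line + 1):
        # Lines with no coded tier get a row of zeros
        cells = sparse_rows.get(line_no, [0] * n_codes)
        # Zero cells are shown as empty strings, as in the +d1 listing
        shown = [str(c) if c else "" for c in cells]
        expanded.append(shown + [str(line_no)])
    return expanded


# Rows from the output without +d1, keyed by line number (re-typed by hand)
SPARSE = {
    6: [0, 1, 0, 0, 0],
    9: [2, 0, 0, 0, 2],
    13: [0, 0, 0, 1, 0],
    16: [0, 0, 1, 0, 0],
}

rows = expand_to_every_line(SPARSE, last_line=16, n_codes=5)
for row in rows:
    print("\t".join(row))
```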

3) This is a bug, and it is fixed in the latest version of CLAN. It was left
over from the case where the user is searching for two or more codes. For
example:

chains +t%cod +s$dat:% +s$nia:% @

Leonid.


On 06-12-07 10:03, "Nikolai Penner" <mnpenner at uwaterloo.ca> wrote:

>
> Hello!
> I have a couple of difficulties with CHAINS which I don't know how to
> resolve. It would be interesting to know if this happens to others as
> well and how one can cope with these issues.
> Thanks a lot in advance!
>
> 1) CHAINS started to ignore the +wN switch and I can't line up the
> columns showing where each code string occurs. This is what I get:
> adj $cons $dat: $dat:pre:plu $dat:stn:loc:sin $gend: $gend:eng $plur
> $prep $rel $syn $syn:miss:inf+zu $verb $voc line #
> 0 0 1 0 0 0 0 0 0 0 0 0 0 0 10
> 0 0 0 0 0 0 0 0 0 1 0 0 0 0 17
> 0 0 0 1 0 0 0 0 0 0 0 0 0 0 33
> 0 0 1 0 0 0 0 0 0 0 0 0 0 0 42
> 0 0 0 0 0 1 0 0 0 0 0 0 0 0 45
> 0 0 0 0 0 0 0 0 0 0 0 0 1 0 55
> 0 1 0 0 0 0 0 0 0 0 0 0 0 0 65
> 0 0 0 0 0 0 0 0 0 0 0 0 0 1 69
> 0 0 0 0 0 0 0 0 0
>
> I tried all kinds of fonts and it still doesn't work. However, it did
> work before when I set the font to Courier, but then I installed an
> updated version of CLAN and it stopped working.
>
> 2) In the manual as well as in CHAINS help it says that the +d1 switch is
> supposed to display every input line in the output. This would be
> extremely helpful in my research, but CHAINS seems to ignore this
> switch as well. The output is no different from the output without the
> switch at all.
>
> 3) When searching for occurrences of a particular code string, CHAINS,
> for some reason, duplicates certain lines and gives them a count of
> 0.
> This is what I get for the following command:
>
>> chains +t%cod  +s$dat:% @
>
> $dat: line #
> 1 pre:plu 10
> 0 10
> 1 pre:plu 33
> 1 loc:sin 42
> 0 42
> 0 42
> 0 42
> 0 42
> 0 42
> 0 42
> 0 42
> 0 42
> 1 pre:plu 123
> 0 123
> 0 123
> 0 123
> 0 123
> 0 123
> 0 123
> 1 stn:loc:sin 183
> 0 183
> 0 183
> I can't figure out why line 10 has one duplicate, and line 42 has 8.
> There is nothing that different about them.
> Also, here, the code pre:plu occurred twice but it is displayed on two
> different lines, each being assigned a count of 1. Is there a way
> to make CHAINS display the total number of occurrences of a particular
> string? When working with large files which have a lot of codes in
> them, it would be a pain going through the list and trying to count
> how many times each individual code occurred.
>
> Once again, thank you in advance!
> Nikolai
>>
>


