Coordinating tiers with CLAN
Javier
lpxao at psychology.nottingham.ac.uk
Wed May 14 12:01:36 UTC 2003
Dear Brian, Patti, and Charles,
I agree completely with Brian that COMBO is the most appropriate
command, and you could get a lot out of it, but perhaps because of its
multiple capabilities (or my poor skills) it's a bit tricky.
Therefore, let me just add a comment regarding MODREP before you decide to
re-code all your transcripts.
Note that you can exclude things from the "input tier" with the subcommand
"-q". This way you wouldn't require a perfect match between tiers.
With the CHSTRING command you can do the following text-replacements:
(1) Replace all your determiners in the main tier with a final string like
"INNN":
i.e. where it says: " the "
should say: " theINNN "
where it says: " a "
should say: " aINNN "
NOTE: for determiners at the beginning of the utterance, replace
"<TAB>the " with "<TAB>theINNN ".
(2) Mark the rest of the words with a different final string:
i.e. where it says: " "
should say: "OFFF "
(3) And rectify determiners after the last replacement:
i.e. where it says: "INNNOFFF "
should say: "INNN "
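
The three replacements above can be sketched in Python to show their combined
effect on a single main-tier line. This is only an illustration of the string
logic, not a CHSTRING invocation; the function name and the determiner list
(including "la", taken from the example error) are assumptions made for the
sketch.

```python
# Hypothetical determiner list; extend it to whatever your transcripts need.
DETERMINERS = ("the", "a", "la")


def mark_line(line: str) -> str:
    """Apply the three replacement steps to one main-tier line."""
    # Step 1: tag determiners, both mid-utterance (space before) and
    # utterance-initial (tab before).
    for det in DETERMINERS:
        line = line.replace(f" {det} ", f" {det}INNN ")
        line = line.replace(f"\t{det} ", f"\t{det}INNN ")
    # Step 2: every space becomes "OFFF ", which tags the word before it.
    line = line.replace(" ", "OFFF ")
    # Step 3: undo the extra tag that step 2 added to the determiners.
    line = line.replace("INNNOFFF ", "INNN ")
    return line


marked = mark_line("*CHI:\tthe la [*] house is nice .")
print(marked)
```

Each replacement in the sketch consumes the spaces around the determiner, so two
identical determiners in a row would need a second pass; the sketch ignores
that case.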
You'll end up with a very odd file. After the replacements, your transcript
could look like this:
*CHI: TheINNN laINNN [*]OFFF houseOFFF isOFFF niceOFFF .
%mor: det|theINNN det|laINNN
%syn: < MINNNN MINNN >
%err: laINNN =OFFF 0OFFF
But observe that your partially coded %syn and %mor lines now match
perfectly with the *CHI line when using:
MODREP +b*CHI +c%mor -q"*OFFF" *.cha
MODREP +b*CHI +c%syn -q"*OFFF" *.cha
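
To see why the tiers line up once the tagged words are excluded, here is a
rough Python sketch of the idea. It does not reproduce MODREP itself: it
assumes the exclusion pattern is meant to match the OFFF suffix from step (2),
and it assumes utterance terminators such as "." are not counted as words.
The function name and the terminator set are hypothetical.

```python
from fnmatch import fnmatchcase

# Assumed set of utterance terminators that are not treated as words.
TERMINATORS = {".", "?", "!"}


def keep_unexcluded(tier_text: str, pattern: str = "*OFFF") -> list[str]:
    """Return the tokens of a tier that do not match the exclusion pattern."""
    return [
        tok
        for tok in tier_text.split()
        if tok not in TERMINATORS and not fnmatchcase(tok, pattern)
    ]


main_tier = "TheINNN laINNN [*]OFFF houseOFFF isOFFF niceOFFF ."
mor_tier = "det|theINNN det|laINNN"

main_kept = keep_unexcluded(main_tier)
mor_kept = keep_unexcluded(mor_tier)

# With the OFFF words gone, both tiers have the same number of items,
# so they can be paired one-to-one.
pairs = list(zip(main_kept, mor_kept))
print(pairs)
```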
What is more, you can probably sort out your %err line with similar
replacements.
In fact, that's more or less what COMBO would automatically do.
Just in case, let me remind everyone that no one should ever run systematic
replacements on the original files! Always work on copies.
Best wishes,
Javier Aguado Orea
School of Psychology
University of Nottingham
NG7 2RD
UK
on 13/5/03 6:10 pm, Patti Spinner at pattispinner at hotmail.com wrote:
> Thanks, everyone, for the suggestions. I do have the problem that I don't
> have a one-to-one match between the words in the main line and on the
> dependent tiers. I think I'll go back and try it a few different ways and
> we'll see how it goes! If I think of some useful extensions, I will also
> let you know.
>
> Patti Spinner
>
>
>
>> From: "Brian MacWhinney" <macw at cmu.edu>
>> To: "Patti Spinner" <pattispinner at hotmail.com>
>> CC: info-chibolts at mail.talkbank.org
>> Subject: Re: Coordinating tiers with CLAN
>> Date: Tue, 13 May 2003 11:24:09 -0400 (EDT)
>>
>> Dear Patti, Charles, Javier, and Info-ChiBolts,
>> Javier's answer was extremely accurate and helpful. He pointed out the
>> importance of adding an asterisk to mark the error on the %mor line, the
>> use of freq, and the use of modrep. I would only add a couple of
>> things. First, you can use what the manual calls cross-tier COMBO to
>> spot most of these beasts. The only problem with cross-tier COMBO is
>> that the search strings are tricky to compose, but there are examples in
>> the manual. Second, Javier correctly points out that MODREP requires a
>> one-to-one match between main line and dependent tier. If you are doing
>> simple syntactic category tagging, that is easy for the %syn. And if
>> you are using the automatic MOR, this is always correct there too. The
>> big problem is that the %err line has never been properly integrated
>> with the other programs and this is why you need the asterisk on the
>> %mor line.
>> Javier suggests that enhancements to the MODREP program might help in
>> this area. So, if you folks think that MODREP would be more useful than
>> cross-tier COMBO and if you can specify proposed extensions, we will try
>> to implement them.
>>
>> --Brian MacWhinney
>>
>