including/excluding @fp and generating new files
F.J.Myles at soton.ac.uk
Fri Dec 13 12:45:44 UTC 2002
As you know, we have been working on SLA corpora for the past year or so. We
are now re-formatting other French L2 corpora using CHILDES as part of two
new funded research projects. Your e-mail below has got me worried, as a
substantial part of our work in the past year has been to adapt the CHILDES
tools for SLA purposes.
One of the things we have done is to add codes to the dep file, which are
vital for our analysis. As we are using the French Mor parser, and as the
focus of our analysis is morphosyntactic, we have coded words borrowed from
English according to their syntactic category, so that we can still analyse
the sentence structure of utterances which have some English words (a very
common occurrence in our data). For example, in a sentence such as 'j'aime
jouer à table+tennis' (I like playing table tennis), we need to be able to
parse table+tennis as a noun borrowed from English. We have therefore added
@d (nouns), @v (verbs), @a (adjectives) etc. to all words borrowed from
English which fit into a French sentence pattern, so that the
morphosyntactic parser comes up with the correct analysis. Similarly, we
have added @n for a number of neologisms which we have added to the French
lexicon, again ensuring correct analysis by the parser.
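To make the coding scheme concrete, here is a minimal Python sketch of how
such suffix codes could be separated from the words that carry them. It is an
illustration only and not part of CLAN or the MOR parser: the mapping table
and the function name split_code are assumptions made for this example.

```python
# Hypothetical mapping from the suffix codes described above to a label.
# These labels are illustrative; they are not CLAN or MOR category names.
SUFFIX_CATEGORIES = {
    "@d": "noun borrowed from English",
    "@v": "verb borrowed from English",
    "@a": "adjective borrowed from English",
    "@n": "neologism",
}


def split_code(token):
    """Return (word, category); category is None if the token has no code."""
    for suffix, category in SUFFIX_CATEGORIES.items():
        if token.endswith(suffix):
            return token[: -len(suffix)], category
    return token, None


tokens = "j'aime jouer à table+tennis@d".split()
for token in tokens:
    print(split_code(token))
```

In this sketch only table+tennis@d carries a code, so it is the only token
returned with a category; the French words pass through unchanged.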
Am I right in thinking that we cannot do that any longer? Being able to
adapt CHILDES for SLA purposes in this way was one of the major reasons for
us choosing to use it in the first place, and I do hope I am wrong!
On a different topic, we have been experiencing problems with the new
version of CLAN, which keeps telling us that the @Participants tier is
missing, when it is not. Any suggestions?
with best wishes,
Dr Florence Myles
Senior Lecturer in French and Linguistics
School of Modern Languages
University of Southampton
Southampton SO17 1BJ
tel (0)23 80 592269
fax (0)23 80 593288
e-mail: fjm at soton.ac.uk
----- Original Message -----
From: "Brian MacWhinney" <macw at cmu.edu>
To: <info-chibolts at mail.talkbank.org>
Cc: "Sophie and Yiannis Georgiou" <yiansoph at cytanet.com.cy>
Sent: Thursday, December 05, 2002 5:15 PM
Subject: Re: including/excluding @fp and generating new files
> On 12/5/02 2:57 AM, "Sophie and Yiannis Georgiou"
> <yiansoph at cytanet.com.cy>
> > Hi to all!
> > I'm a new user of CLAN.
> > I have used the symbol @gr after words
> > that are greek (the main language of the
> > transcripts is English). How do I create
> > a new file for @gr to get CHECK to accept
> > it as a symbol and for MLT and MLU to count
> > the words?
> > I'm also interested in both including them
> > and excluding them. How can I get the
> > programme to exclude these words?
> > This also goes for the @fp symbol which I've
> > used for filled pauses. How can I run MLU
> > and MLT excluding these from the count?
> > Sophie
> Dear Sophie,
> You should use the symbol @s for second language. In the new XML
> framework for CHAT, we have to be more restrictive about the shape of words,
> so you can't just use @gr or some extension you create. Older versions of
> CLAN allowed this using the 00depadd.cut file, but new versions do not, so
> now you must use @s.
> To exclude your @s words in the various programs, you typically use the
> -s"*\@s" switch which should exclude all words ending in @s. If you want to
> exclude filled pauses such as um, you should mark them as um@fp and exclude
> them in a similar manner. By default @fp and @s are included, unless you
> specifically exclude them. The rules on default exclusions in MLU are
> complex and it is always best to consult the manual carefully when doing
> these analyses.
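
As a rough illustration of what excluding marked words does to a count, here
is a minimal Python sketch. It is not CLAN and does not reproduce the MLU or
MLT algorithms (it counts whitespace-separated words, not morphemes); the
function name count_words and the sample utterance are assumptions made for
this example.

```python
def count_words(utterance, exclude_markers=()):
    """Count whitespace-separated words, skipping any word that ends in
    one of the given markers (for example "@s" or "@fp")."""
    words = utterance.split()
    kept = [w for w in words if not any(w.endswith(m) for m in exclude_markers)]
    return len(kept)


# Hypothetical utterance: one filled pause (um@fp), one second-language word (nero@s).
utterance = "I want um@fp the nero@s please"

print(count_words(utterance))                 # nothing excluded
print(count_words(utterance, ("@s",)))        # second-language words excluded
print(count_words(utterance, ("@s", "@fp")))  # filled pauses excluded as well
```

With nothing excluded, every word is counted, which mirrors the default
described above; each marker added to exclude_markers removes the words that
end in it from the total.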