Central Australian Study/SHOEBOX

Brian MacWhinney macw at cmu.edu
Tue Dec 10 21:08:11 UTC 2002


Dear Patrick,

  Your project on Aboriginal language sounds fascinating. Your question
regarding the issue of interactions between Shoebox and CHILDES programs and
data is an important one.  Shoebox is a wonderful tool for purposes of basic
field linguistics.  It tends to emphasize the development of a set of
carefully-entered morphological analyses for a custom dictionary of lexical
forms.  As the linguist continues to work on the language, each new
utterance is analyzed in terms of its match to the previously entered
lexical items. One of the nicest features of Shoebox is that the resulting
morphological analyses are neatly arranged using interlinear alignment.
  CHILDES tools have certain limitations when compared to Shoebox, but also
many other advantages.  Instead of building up a morphological analysis word
by word, in CHILDES you use the MOR grammar system to analyse words.  To do
this best, you need to learn how to build MOR grammars, and that is not too
easy.  Alternatively, you can use the minMOR system, which is pretty close to
Shoebox in requiring you to enter each form one by one.  You can combine the
two.  For example, you could use analytic MOR for nouns and minMOR for
verbs.
  The major advantages of the CHAT format and the CLAN programs relate to
their better ability to represent and process larger transcripts.  They are
better designed for larger corpora, alignment with audio and video, and
analysis using search programs.
  We have begun work on making CHILDES and Shoebox more interactive.  For
example, Mike Maxwell at the University of Pennsylvania is working on
"rescuing" some old Shoebox corpora on Native American languages by converting
them to CHAT format.  This is becoming easier, since the CHAT format can now
be expressed in XML, which is a convenient translation medium between
different formats.  Once we have developed an XML translator for Shoebox, we
can translate between formats.  However, I don't know how far Mike has
gotten on this.
   Once the translator is written, you could use either Shoebox or CHILDES
and perhaps move your data back and forth between the two.  However, my
guess is that you will lose some information when going from CHAT back to
Shoebox, since Shoebox format is less structured in some regards.
  You may also wish to consult Steven Bird at Melbourne (sb at cs.mu.oz.au).
Steven has built some great tools using his AG format, which is compatible in
some ways with the CHILDES CHAT format.  He has a lot of experience in field
linguistics and computational linguistics.
   To find CHILDES users in Australia, you can go to the membership list on
the home page and open it with username member and password babbling to
search for Australia as a country.

Good luck,

Brian MacWhinney


