Corpora: Knowledge about TreeTagger and MXPOST???

Steven Bird sb at unagi.cis.upenn.edu
Wed Sep 13 23:54:56 UTC 2000


Rachel Aires writes:
> I want to know what the key is to understanding the information
> in the file created by TreeTagger, and what the relation is
> among the 10 files created by MXPOST.
> Has anyone ever asked themselves that?

This is an instance of a more general problem, the babel of formats,
which motivated a forthcoming special issue of Speech Communication:

  Speech Annotation and Corpus Tools
  http://www.ldc.upenn.edu/annotation/specom.html
  (Steven Bird & Jonathan Harrington, eds)

Various projects are addressing this need for standard formats, and
a good starting point is the Linguistic Annotation page.

  Linguistic Annotation
  http://www.ldc.upenn.edu/annotation/
  (Steven Bird & Mark Liberman, eds)

Updates welcome...

Steven Bird

--
Steven.Bird at ldc.upenn.edu  http://www.ldc.upenn.edu/sb
Assoc Director, LDC; Adj Assoc Prof, CIS & Linguistics
Linguistic Data Consortium, University of Pennsylvania
3615 Market St, Suite 200, Philadelphia, PA 19104-2608


