[Corpora-List] Converting the LDC NANTC to XML
Scott James Cederberg
cederber at csli.stanford.edu
Thu Jun 12 20:16:14 UTC 2003
Hello corpora folks,
I'm attempting to convert the LDC North American News Text
Corpus (NANTC; LDC95T21) to XML, using the OSX tool (descended
from James Clark's SX).
Has anyone else done this? One thing that stands in the way is
that we don't have a DTD for the NANTC SGML format; does anyone
have one?
Any help/pointers/advice appreciated.
Scott Cederberg
CSLI
Stanford University