[Corpora-List] Tag-set conversion
Timothy Baldwin
tbaldwin at csli.Stanford.EDU
Fri Jan 31 01:32:42 UTC 2003
> Does anybody know of an existing tool to translate between the BNC C5
> tag-set and the Penn Tree Bank tag-set?
Assuming you are running Solaris or Linux, you could use the tools supplied
with cass, as developed by Marc Light and Steve Abney:
http://whorf.sfs.nphil.uni-tuebingen.de/~abney/scol1e.tar.gz
Their use is documented in the cass manual supplied in the tarball, but for
the record, you run:
    bncsents BNCFILE | tagfixes -f bnc.fxc
where BNCFILE is a BNC source file.
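
If you don't have cass to hand, the general idea of a tag-set conversion can be sketched as a simple lookup table. The Python below is only an illustration: the table is a small, partial C5-to-Penn mapping that I am assuming for the example (it is not the contents of bnc.fxc), the function names are hypothetical, and several of the correspondences are lossy (e.g. C5 NP0 does not mark number, and VVB covers imperatives as well as finite base forms).

```python
# Illustrative, PARTIAL C5 -> Penn Treebank mapping.
# Assumption for this sketch only; not the table used by tagfixes/bnc.fxc.
C5_TO_PENN = {
    "NN1": "NN",   # singular common noun
    "NN2": "NNS",  # plural common noun
    "NP0": "NNP",  # proper noun (C5 does not mark number)
    "AJ0": "JJ",   # adjective
    "AJC": "JJR",  # comparative adjective
    "AJS": "JJS",  # superlative adjective
    "AV0": "RB",   # general adverb
    "AT0": "DT",   # article
    "CJC": "CC",   # coordinating conjunction
    "PRP": "IN",   # preposition
    "VVB": "VBP",  # finite base form of lexical verb (approximation)
    "VVD": "VBD",  # past tense
    "VVG": "VBG",  # -ing form
    "VVI": "VB",   # infinitive
    "VVN": "VBN",  # past participle
    "VVZ": "VBZ",  # -s form
}


def convert_tag(c5_tag, unknown="UNK"):
    """Map one C5 tag to a Penn tag.

    BNC ambiguity tags such as "NN1-VVB" are resolved by taking the
    first (preferred) member; unmapped tags come back as `unknown`.
    """
    first = c5_tag.split("-")[0]
    return C5_TO_PENN.get(first, unknown)


def convert_sentence(tagged):
    """Convert a list of (word, C5 tag) pairs to (word, Penn tag) pairs."""
    return [(word, convert_tag(tag)) for word, tag in tagged]
```

A real conversion needs more than a one-to-one table, since some C5 tags split or merge differently in the Penn set, which is the sort of thing the cass tools take care of for you.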
Alternatively, you could simply retag the BNC with a Penn-style tagger, given
that the BNC tags were for the most part assigned automatically anyway.
Tim