[Corpora-List] Automatically checking a treebank for errors

Kevin B. Cohen kevin.cohen at gmail.com
Fri Jun 17 19:47:37 UTC 2011


Does anyone know of any tricks for automatically checking a Penn
Treebank-style corpus for obvious errors?  I've done some simple stuff
in the past for checking POS tags, like looking for punctuation marks
with non-punctuation tags, which turned out to be really fruitful, but
I can't think of anything clever to do for the syntactic structures.
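
For reference, the punctuation check described above could be sketched roughly as follows. This is only an illustration, not the script actually used: the tag set in PUNCT_TAGS, the bracket-escape list, and the function names are assumptions, and the regular expression assumes the usual bracketed "(TAG token)" leaf format. Hits are candidates for manual review, not confirmed errors (a token like "%" may be legitimately tagged NN, for example).

```python
import re
import string

# Assumed set of tags a Penn Treebank-style corpus uses for punctuation
# and symbols; adjust this for the corpus at hand.
PUNCT_TAGS = {",", ".", ":", "``", "''", "-LRB-", "-RRB-", "$", "#", "SYM"}

# Escaped bracket tokens that stand in for literal brackets in the trees.
BRACKET_ESCAPES = {"-LRB-", "-RRB-", "-LSB-", "-RSB-", "-LCB-", "-RCB-"}

# Matches a preterminal such as "(NN dog)" and captures (tag, token).
LEAF = re.compile(r"\(([^\s()]+)\s+([^\s()]+)\)")


def is_punct_token(token):
    """True if the token is an escaped bracket or consists only of punctuation."""
    return token in BRACKET_ESCAPES or all(ch in string.punctuation for ch in token)


def suspect_punct_leaves(tree_text):
    """Return (tag, token) pairs where a punctuation token has a non-punctuation tag."""
    suspects = []
    for tag, token in LEAF.findall(tree_text):
        if tag == "-NONE-":
            continue  # skip empty elements / traces
        if is_punct_token(token) and tag not in PUNCT_TAGS:
            suspects.append((tag, token))
    return suspects


example = "(S (NP (DT The) (NN dog)) (VP (VBD barked)) (NN .))"
print(suspect_punct_leaves(example))
```

The reverse check (a punctuation tag on a token containing letters or digits) can be written the same way by swapping the two conditions.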

Kev

-- 
Kevin Bretonnel Cohen, PhD
Biomedical Text Mining Group Lead, Computational Bioscience Program,
U. Colorado School of Medicine
303-916-2417 (cell) 303-377-9194 (home)
http://compbio.ucdenver.edu/Hunter_lab/Cohen
