[Corpora-List] Automatically checking a treebank for errors
Adriane Boyd
adriane at ling.ohio-state.edu
Sat Jun 18 08:38:48 UTC 2011
Hi Kevin,
Please check out work by Markus Dickinson and Detmar Meurers on error
detection in corpus annotation:
http://decca.osu.edu/
For POS and Penn Treebank-style annotation, the relevant publications are
from 2003-2005. The DECCA software includes code for detecting errors in
POS annotation, Penn Treebank-style syntax trees, syntactic annotation
with discontinuous constituents, and dependency annotation.
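
To give a rough idea of the approach for the POS case: the core idea is
variation detection, i.e. finding a word that occurs in the same
surrounding context more than once but with different tags. The snippet
below is only a minimal sketch of that idea and is not the DECCA code;
the function name find_tag_variation, the fixed one-word context window,
and the toy corpus are all assumptions made for illustration.

```python
from collections import defaultdict

def find_tag_variation(tagged_sentences, context=1):
    """Map each word n-gram to the tags seen for its middle word,
    keeping only n-grams whose middle word received more than one tag.

    tagged_sentences: list of sentences, each a list of (word, tag) pairs.
    context: number of words of context on each side (an assumed default).
    """
    seen = defaultdict(set)
    for sent in tagged_sentences:
        words = [w for w, _ in sent]
        for i in range(context, len(sent) - context):
            ngram = tuple(words[i - context:i + context + 1])
            seen[ngram].add(sent[i][1])
    return {ng: tags for ng, tags in seen.items() if len(tags) > 1}

# Toy corpus (hypothetical data, with one deliberately inconsistent tag).
corpus = [
    [("the", "DT"), ("plan", "NN"), ("to", "TO"), ("run", "VB"), ("again", "RB")],
    [("they", "PRP"), ("want", "VBP"), ("to", "TO"), ("run", "NN"), ("again", "RB")],
    [("a", "DT"), ("run", "NN"), ("today", "NN")],
]

variation = find_tag_variation(corpus)
for ngram, tags in sorted(variation.items()):
    print(" ".join(ngram), sorted(tags))
```

Each reported n-gram is only a candidate: the variation may be genuine
ambiguity rather than an annotation error, so the hits still need manual
inspection. Widening the context generally gives fewer but more reliable
candidates.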
-Adriane
On Fri, 17 Jun 2011, Kevin B. Cohen wrote:
> Does anyone know of any tricks for automatically checking a Penn
> Treebank-style corpus for obvious errors? I've done some simple stuff
> in the past for checking POS tags, like looking for punctuation marks
> with non-punctuation tags, which turned out to be really fruitful, but
> I can't think of anything clever to do for the syntactic structures.
>
> Kev