[Corpora-List] Automatically checking a treebank for errors
Ann Bies
annbies at yahoo.com
Sun Jun 19 14:15:42 UTC 2011
Hi, Kevin,
There is also some work at LDC that may be relevant. It uses a TAG-based decomposition of the treebank to compare syntactic structures:
http://papers.ldc.upenn.edu/ACL2011/DerivationTrees_TBErrorDetection.pdf
Seth Kulick, Ann Bies, and Justin Mott
Using Derivation Trees for Treebank Error Detection
ACL 2011, Portland, Oregon, USA, June 19-24, 2011
Thanks,
Ann
--- On Fri, 6/17/11, Kevin B. Cohen <kevin.cohen at gmail.com> wrote:
From: Kevin B. Cohen <kevin.cohen at gmail.com>
Subject: [Corpora-List] Automatically checking a treebank for errors
To: "Corpora List" <corpora at uib.no>
Date: Friday, June 17, 2011, 3:47 PM
Does anyone know of any tricks for automatically checking a Penn
Treebank-style corpus for obvious errors? I've done some simple stuff
in the past for checking POS tags, like looking for punctuation marks
with non-punctuation tags, which turned out to be really fruitful, but
I can't think of anything clever to do for the syntactic structures.
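The POS-tag check described above (punctuation marks carrying non-punctuation tags) could be sketched roughly as follows. This is only an illustration, not the script actually used: it assumes trees are stored as Penn Treebank-style bracketed text, and the names PUNCT_TAGS, PUNCT_TOKEN, LEAF, and suspicious_punctuation are hypothetical. The lists of punctuation tags and tokens are assumptions that would need adjusting to the corpus's own guidelines.

```python
import re

# Assumption: the usual Penn Treebank punctuation tag set.
PUNCT_TAGS = {",", ".", ":", "``", "''", "-LRB-", "-RRB-", "#", "$"}

# Assumption: tokens that should always carry a punctuation tag.
# Symbols such as "%" or "&" are left out on purpose, since PTB-style
# corpora commonly tag them NN, CC, or SYM.
PUNCT_TOKEN = re.compile(r"^(?:[,.;:!?]+|--+|``|''|-LRB-|-RRB-|-LCB-|-RCB-)$")

# A preterminal looks like "(TAG token)" with no nested parentheses.
LEAF = re.compile(r"\(([^\s()]+)\s+([^\s()]+)\)")


def suspicious_punctuation(tree_text):
    """Yield (tag, token) pairs where a punctuation token has a non-punctuation tag."""
    for tag, token in LEAF.findall(tree_text):
        if PUNCT_TOKEN.match(token) and tag not in PUNCT_TAGS:
            yield tag, token


if __name__ == "__main__":
    # Toy example: the sentence-final period is mis-tagged as NN.
    sample = "(S (NP (DT The) (NN dog)) (VP (VBD barked)) (NN .))"
    for tag, token in suspicious_punctuation(sample):
        print(f"suspicious: token {token!r} tagged {tag}")
```

Each hit is only a candidate for manual review. Some mismatches may be legitimate under a given corpus's annotation guidelines.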
Kev
--
Kevin Bretonnel Cohen, PhD
Biomedical Text Mining Group Lead, Computational Bioscience Program,
U. Colorado School of Medicine
303-916-2417 (cell) 303-377-9194 (home)
http://compbio.ucdenver.edu/Hunter_lab/Cohen
_______________________________________________
UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora