[Corpora-List] Automatically checking a treebank for errors

Nitin Madnani nmadnani at gmail.com
Sun Jun 19 20:47:14 UTC 2011


There's another poster at ACL 2011 that's relevant:

    S-4: Automatic Detection and Correction of Errors in Dependency Treebanks
    Alexander Volokh and Günter Neumann (DFKI)

- Nitin

On Jun 19, 2011, at 10:15 AM, Ann Bies <annbies at yahoo.com> wrote:

> Hi, Kevin,
> 
> There is also some work at LDC on using a TAG-based decomposition of the treebank to compare syntactic structures that may be relevant:
> 
> http://papers.ldc.upenn.edu/ACL2011/DerivationTrees_TBErrorDetection.pdf
> 
> Seth Kulick, Ann Bies, and Justin Mott
> Using Derivation Trees for Treebank Error Detection
> ACL 2011, Portland, Oregon, USA, June 19-24, 2011
> 
> Thanks,
> 
> Ann
> 
> 
> --- On Fri, 6/17/11, Kevin B. Cohen <kevin.cohen at gmail.com> wrote:
> 
> From: Kevin B. Cohen <kevin.cohen at gmail.com>
> Subject: [Corpora-List] Automatically checking a treebank for errors
> To: "Corpora List" <corpora at uib.no>
> Date: Friday, June 17, 2011, 3:47 PM
> 
> Does anyone know of any tricks for automatically checking a Penn
> Treebank-style corpus for obvious errors?  I've done some simple stuff
> in the past for checking POS tags, like looking for punctuation marks
> with non-punctuation tags, which turned out to be really fruitful, but
> I can't think of anything clever to do for the syntactic structures.
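
The POS-tag check described above (punctuation marks carrying non-punctuation tags) could be sketched roughly as follows. This is only an illustration, not the script used in the original work: the function names, the tag set in PUNCT_TAGS, and the bracket escapes in BRACKET_TOKENS are all assumptions and would need adjusting to the corpus's own tagging guidelines. The input is assumed to be a list of (token, tag) pairs already extracted from the trees.

```python
import string

# Assumed set of Penn Treebank-style tags that may legitimately label
# punctuation or symbols; adjust to the guidelines of your corpus.
PUNCT_TAGS = {",", ".", ":", "``", "''", "-LRB-", "-RRB-", "$", "#", "SYM"}

# Bracket escapes commonly used as tokens in Penn Treebank-style files
# (assumed list).
BRACKET_TOKENS = {"-LRB-", "-RRB-", "-LSB-", "-RSB-", "-LCB-", "-RCB-"}


def is_punct_token(token):
    """Return True if the token is a bracket escape or only ASCII punctuation."""
    if token in BRACKET_TOKENS:
        return True
    return bool(token) and all(ch in string.punctuation for ch in token)


def find_suspicious_tags(tagged_tokens):
    """Return (index, token, tag) for punctuation tokens with other tags.

    Hits are candidates for manual review, not confirmed errors: some
    guidelines deliberately give a symbol a non-punctuation tag.
    """
    return [
        (i, tok, tag)
        for i, (tok, tag) in enumerate(tagged_tokens)
        if is_punct_token(tok) and tag not in PUNCT_TAGS
    ]


# Hypothetical sample sentence with one deliberate tagging error.
sample = [
    ("The", "DT"), ("end", "NN"), (".", "NN"),
    (",", ","), ("-LRB-", "-LRB-"), ("--", ":"),
]
for index, token, tag in find_suspicious_tags(sample):
    print(f"token {index}: {token!r} tagged {tag}")
```

The same loop can be run in the other direction (alphabetic tokens carrying punctuation tags) by swapping the two conditions.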
> 
> Kev
> 
> -- 
> Kevin Bretonnel Cohen, PhD
> Biomedical Text Mining Group Lead, Computational Bioscience Program,
> U. Colorado School of Medicine
> 303-916-2417 (cell) 303-377-9194 (home)
> http://compbio.ucdenver.edu/Hunter_lab/Cohen
> 
> _______________________________________________
> UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora

