[Corpora-List] BBN Named Entity annotation of 15 million word OANC now available
Nancy Ide
ide at cs.vassar.edu
Tue Nov 9 17:03:22 UTC 2010
*******************************************************************
BBN Named Entity annotation of the 15 million word Open ANC
*******************************************************************
http://www.anc.org
The American National Corpus (ANC) project has received a contribution of named entity annotation
for the entire 15 million words of the Open American National Corpus, which is now freely
available for download from the ANC website. The annotations were automatically produced
by the BBN named entity tagger (see http://www.ldc.upenn.edu/acl/W/W03/W03-1506.pdf)
and contributed by Sameer Pradhan. The download contains the OANC texts, respecting the
OANC directory structure, with inline annotations in an XML-like format.
The ANC project is in the process of generating a version of these annotations in standoff GrAF
format so that they may be combined with other OANC annotations using the ANC2Go web
application (http://www.anc.org:8080/ANC2Go) or the stand-alone ANCTool.
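
For readers who want to work with the inline version in the meantime, the general idea of moving from inline to standoff annotation can be sketched in a few lines of Python. This is only an illustration, not the ANC's conversion and not GrAF output: the announcement does not specify the tag names used in the download, so the sketch assumes MUC-style tags of the form <ENAMEX TYPE="PERSON">...</ENAMEX>, and the function name inline_to_standoff is hypothetical. Nested tags and XML character entities are not handled.

```python
import re

# ASSUMPTION: inline tags look like <ENAMEX TYPE="PERSON">text</ENAMEX>.
# The actual tag names in the OANC download may differ; adjust the pattern.
TAG = re.compile(
    r'<(?P<name>[A-Za-z]+)\s+TYPE="(?P<type>[^"]+)">(?P<text>.*?)</(?P=name)>',
    re.DOTALL,
)


def inline_to_standoff(annotated):
    """Strip inline tags and return (plain_text, spans).

    Each span is (start, end, type), with character offsets into plain_text.
    """
    plain_parts = []
    spans = []
    pos = 0      # read position in the annotated input
    offset = 0   # length of the plain text emitted so far
    for m in TAG.finditer(annotated):
        before = annotated[pos:m.start()]
        plain_parts.append(before)
        offset += len(before)
        text = m.group("text")
        spans.append((offset, offset + len(text), m.group("type")))
        plain_parts.append(text)
        offset += len(text)
        pos = m.end()
    plain_parts.append(annotated[pos:])
    return "".join(plain_parts), spans


if __name__ == "__main__":
    sample = (
        'Contributed by <ENAMEX TYPE="PERSON">Sameer Pradhan</ENAMEX> '
        'to the <ENAMEX TYPE="ORGANIZATION">ANC</ENAMEX>.'
    )
    plain, spans = inline_to_standoff(sample)
    print(plain)
    for start, end, etype in spans:
        print(start, end, etype, plain[start:end])
```

Keeping the offsets separate from the text is what allows several independent annotation layers to refer to the same unmodified source document.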
The ANC welcomes contributions of both annotations and texts, which we release for
free download by the community from our website. ANC, OANC, and MASC data and annotations are,
or will also be, distributed through the Linguistic Data Consortium. To contribute, send email to
anc at anc.org or consult http://www.anc.org/contribute.html.
==============================================================================
THE ANC PROJECT IS COMMITTED TO OPEN DATA FOR LANGUAGE RESEARCH, DEVELOPMENT,
AND EDUCATION. ALL CONTRIBUTIONS OF BOTH DATA AND ANNOTATIONS SHOULD BE
UNENCUMBERED BY LICENSING RESTRICTIONS. ALL CONTRIBUTIONS ARE MADE FREELY AVAILABLE
FOR USE BY THE COMMUNITY.
===============================================================================
_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora