<html><head></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; "><div><div> *******************************************************************<br></div><div> Three syntactic annotations of 11 million words of the Open ANC<br></div><div> *******************************************************************</div><br><div>The American National Corpus (ANC) project has received a contribution of three syntactic parses<br></div><div>for 11 million of the 15 million words of the Open American National Corpus, which are now freely<br></div><div>available for download from the ANC website. The annotations were automatically produced<br></div><div><span class="Apple-style-span" style="color: rgb(25, 25, 25); line-height: 15px; ">using the Charniak & Johnson (2005) parser, the MaltParser </span>(Nivre et al., 2007)<span class="Apple-style-span" style="color: rgb(25, 25, 25); line-height: 15px; ">, and the LTH dependency </span></div><div><span class="Apple-style-span" style="color: rgb(25, 25, 25); line-height: 15px; ">converter </span>(Johansson & Nugues, 2007)<span class="Apple-style-span" style="color: rgb(25, 25, 25); line-height: 15px; ">. The annotations </span><span class="Apple-style-span" style="color: rgb(25, 25, 25); line-height: 15px; ">were contributed by Rasul Kalajahi.</span></div><div><br></div><div>The download contains the input to and output from each parser, in Penn Treebank and CoNLL formats. 
</div><div>The ANC project is in the process of generating a version of these annotations in standoff GrAF</div><div>format so that they may be combined with other OANC annotations using the ANC2Go web<br></div><div>application (<a href="http://www.anc.org:8080/ANC2Go">http://www.anc.org:8080/ANC2Go</a>) or the stand-alone ANCTool.<br></div><div><br></div><div><div> ********************************************************************************<br></div><div> Manually-generated coreference annotations of 128K words of the Open ANC<br></div><div> ********************************************************************************<br></div><div><span class="Apple-style-span" style="line-height: 19px; "><a href="http://www.cs.ualberta.ca/people/profile.php?who=95362" style="text-decoration: none; "><font class="Apple-style-span" color="#080808">Shane Bergsma</font></a></span><span class="Apple-style-span" style="line-height: 19px; "><font class="Apple-style-span" color="#080808"> of the </font></span><span class="Apple-style-span" style="line-height: 19px; "><a href="http://www.cs.ualberta.ca/" style="text-decoration: none; "><font class="Apple-style-span" color="#080808">University of Alberta</font></a></span><span class="Apple-style-span" style="line-height: 19px; "><font class="Apple-style-span" color="#080808"> has annotated a subset of the Slate journal data for coreference </font></span></div><div><span class="Apple-style-span" style="line-height: 19px; "><font class="Apple-style-span" color="#080808">(anaphora). 
The annotations consist of pronoun-antecedent pairs in 118 documents (128717 words) from </font></span></div><div><span class="Apple-style-span" style="line-height: 19px; "><font class="Apple-style-span" color="#080808">the </font></span><span class="Apple-style-span" style="line-height: 19px; "><a href="http://www.anc.org/FirstRelease/contents.html#slate" style="text-decoration: none; "><font class="Apple-style-span" color="#080808">Slate data</font></a></span><span class="Apple-style-span" style="line-height: 19px; "><font class="Apple-style-span" color="#080808"> of the ANC/OANC. The data include a test set and a training set; there are 1398 labeled </font></span></div><div><span class="Apple-style-span" style="line-height: 19px; "><font class="Apple-style-span" color="#080808">pronouns in 78 documents in the training set and 1381 labeled pronouns in 40 documents in the test set.</font></span></div></div><div><font class="Apple-style-span" color="#080808"><br></font></div><div><span class="Apple-style-span" style="line-height: 19px; "><font class="Apple-style-span" color="#080808">At present these annotations are provided as a separate corpus in the </font><a href="http://www.anc.org/FirstRelease/encoding.html" style="text-decoration: none; "><font class="Apple-style-span" color="#080808">standoff XCES format</font></a><font class="Apple-style-span" color="#080808"> used for the ANC First </font></span></div><div><span class="Apple-style-span" style="line-height: 19px; "><font class="Apple-style-span" color="#080808">and Second </font></span><span class="Apple-style-span" style="color: rgb(8, 8, 8); line-height: 19px; ">releases and the current version of the OANC (a release of the OANC in GrAF format, which will supersede the </span></div><div><span class="Apple-style-span" style="line-height: 19px; "><font class="Apple-style-span" color="#080808">current XCES format, will be available </font></span><span class="Apple-style-span" style="color: rgb(8, 8, 8); 
line-height: 19px; ">at the end of this month). A GrAF version of the coreference annotations</span></div><div><span class="Apple-style-span" style="color: rgb(8, 8, 8); line-height: 19px; ">is also being produced.</span></div><div><span class="Apple-style-span" style="line-height: 19px; "><font class="Apple-style-span" color="#080808"><br></font></span></div><div><span class="Apple-style-span" style="line-height: 19px; "><font class="Apple-style-span" color="#080808">All annotations of the OANC are available at <a href="http://www.anc.org/annotations.html">http://www.anc.org/annotations.html</a></font></span></div><div><br></div><div>------------------------------------------------------------------------------------------------------------------------------</div><div>The ANC welcomes contributions of annotations, texts, and derived data, which we release for<br></div><div>free download by the community from our website. ANC, OANC, and MASC data and annotations are<br></div><div>or will also be available through the Linguistic Data Consortium. To contribute, send email to<br></div><div><a href="mailto:anc@anc.org">anc@anc.org</a> or consult <a href="http://www.anc.org/contribute.html">http://www.anc.org/contribute.html</a>.</div><br><div>==============================================================================<br></div><div>THE ANC PROJECT IS COMMITTED TO OPEN DATA FOR LANGUAGE RESEARCH, DEVELOPMENT,<br></div><div>AND EDUCATION. ALL CONTRIBUTIONS OF BOTH DATA AND ANNOTATIONS SHOULD BE<br></div><div>UNENCUMBERED BY LICENSING RESTRICTIONS. ALL CONTRIBUTIONS ARE MADE FREELY AVAILABLE<br></div><div>FOR USE BY THE COMMUNITY.<br></div><div>===============================================================================</div></div><div><br></div><b>NOTE: The reference link for the BBN Named Entity tagger given in the Nov. 9 notice concerning the release</b><div><b>of OANC annotations was incorrect. 
The correct link is <a href="http://www.aclweb.org/anthology-new/N/N04/N04-1043.pdf">http://www.aclweb.org/anthology-new/N/N04/N04-1043.pdf</a>.</b></div></body></html>