<span class="Apple-style-span" style="font-family: arial, sans-serif; font-size: 13px; background-color: rgb(255, 255, 255); "><div>We are very pleased to announce that version 0.9 of the Icelandic Parsed Historical Corpus (IcePaHC) is now available for free download. </div>
<div><br></div><div>The corpus can be downloaded from:</div><div><a href="http://www.linguist.is/icelandic_treebank/Download" target="_blank" style="color: rgb(87, 151, 176); ">www.linguist.is/icelandic_treebank/Download</a></div>
<div><br></div><div>The corpus is a treebank of over 1 million words, annotated with a full phrase structure parse and hand-corrected, using an adaptation of the annotation scheme used by the Penn Treebank and the Penn parsed corpora of historical English (<a href="http://www.ling.upenn.edu/hist-corpora/" target="_blank" style="color: rgb(87, 151, 176); ">http://www.ling.upenn.edu/hist-corpora/</a>). Note that this release contains all of the text planned for version 1.0, but some minor corrections remain to be finished.</div>
<div><br></div><div>The corpus contains:</div><div><br></div><div>- 1 002 361 words total, consisting of ~100 000-word samples from each century from the 12th to the beginning of the 21st century.</div><div>- Annotated with a phrase structure parse, part-of-speech-tagged, and lemmatized.</div>
<div>- The parse, part-of-speech tags, and lemmata for every sentence have been *hand-corrected*.</div><div>- Text samples are balanced for genre within each century.</div><div>- LGPL license: You are free to copy, modify and redistribute the corpus for research and/or profit with appropriate citation.</div>
<div><br></div><div>The corpus is distributed as raw UTF-8 data in labeled bracketing format and it is therefore compatible with various existing programs, including CorpusSearch (<a href="http://corpussearch.sourceforge.net/" target="_blank" style="color: rgb(87, 151, 176); ">http://corpussearch.sourceforge.net/</a>). </div>
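<div><br></div><div>As a rough illustration of what labeled bracketing looks like to a program, the following Python sketch reads one bracketed tree into nested lists and pulls out its tagged terminals. The sample tree, its labels, and the function names <code>parse_tree</code> and <code>leaves</code> are invented for this illustration and are not taken from the corpus or from CorpusSearch; consult the annotation guidelines on the project wiki for the actual tag set and leaf format.</div>

```python
import re

# A token is an opening bracket, a closing bracket, or any run of
# characters that contains neither whitespace nor brackets.
TOKEN = re.compile(r"\(|\)|[^\s()]+")


def parse_tree(text):
    """Parse one labeled-bracketing tree into nested Python lists."""
    tokens = TOKEN.findall(text)
    pos = 0

    def read():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok != "(":
            return tok  # a label or a word
        node = []
        while tokens[pos] != ")":
            node.append(read())
        pos += 1  # consume the closing bracket
        return node

    tree = read()
    if pos != len(tokens):
        raise ValueError("trailing tokens after tree")
    return tree


def leaves(node):
    """Yield (tag, token) pairs for every terminal node."""
    if isinstance(node, list):
        if len(node) == 2 and all(isinstance(x, str) for x in node):
            yield node[0], node[1]
        else:
            for child in node:
                yield from leaves(child)


# Hypothetical example tree (NOT copied from the corpus).
SAMPLE = "( (IP-MAT (NP-SBJ (PRO-N hann)) (VBDI fór)) (ID EXAMPLE,.1))"

tree = parse_tree(SAMPLE)
pairs = list(leaves(tree))
# Drop the sentence-ID node to keep only the words.
words = [tok for tag, tok in pairs if tag != "ID"]
```

<div>Because the files are plain UTF-8, the text should be opened with an explicit encoding (for example <code>open(path, encoding="utf-8")</code>) so that Icelandic characters such as ð and þ survive intact.</div>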
<div><br></div><div>A plain text version without markup and a set of info files containing philological information accompany the corpus download.</div><div><br></div><div>The entire corpus may be downloaded as plain text, or together with a platform-independent GUI or a Windows-compatible GUI for ease of searching.</div>
<div><br></div><div>Further information on the annotation guidelines and project organization can be found on the project wiki:</div><div><a href="http://www.linguist.is/icelandic_treebank/" target="_blank" style="color: rgb(87, 151, 176); ">www.linguist.is/icelandic_treebank/</a></div>
<div><br></div><div><br></div><div>Joel C. Wallenberg (<a href="mailto:joel.wallenberg@gmail.com" target="_blank" style="color: rgb(87, 151, 176); ">joel.wallenberg@gmail.com</a>)</div>
<div>Anton Karl Ingason (<a href="mailto:anton.karl.ingason@gmail.com" target="_blank" style="color: rgb(87, 151, 176); ">anton.karl.ingason@gmail.com</a>)</div><div>Einar Freyr Sigurðsson (<a href="mailto:einarfs@gmail.com" target="_blank" style="color: rgb(87, 151, 176); ">einarfs@gmail.com</a>)</div>
<div>Eiríkur Rögnvaldsson (<a href="mailto:eirikur@hi.is" target="_blank" style="color: rgb(87, 151, 176); ">eirikur@hi.is</a>)</div><div>University of Iceland</div><div><br></div><div>We were grateful to receive support for this project through the following grants:</div>
<div><br></div><div>Icelandic Research Fund (RANNÍS), grant nr. 090662011, "Viable Language Technology beyond English – Icelandic as a test case".</div><div><br></div><div>U.S. National Science Foundation (NSF) International Research Fellowship Program (IRFP), grant #OISE-0853114, "Evolution of Language Systems: a comparative study of grammatical change in Icelandic and English".</div>
<div><br></div><div>University of Iceland Research Fund (Rannsóknasjóður Háskóla Íslands), grant "Icelandic Diachronic Treebank" (Sögulegur íslenskur trjábanki).</div></span><div><br></div><br><br>