Corpora: annotated OE corpus available

Susan Pintzuk sp20 at york.ac.uk
Thu Sep 6 16:00:16 UTC 2001


      The York-Helsinki Parsed Corpus of Old English Poetry

The York-Helsinki Parsed Corpus of Old English Poetry (henceforth the York
Poetry Corpus) is now publicly available. The York Poetry Corpus is a
selection of poetic texts from the Old English Section of the Helsinki
Corpus of English Texts (henceforth the Helsinki Corpus), annotated to
facilitate searches on lexical items and syntactic structure. It is
intended for use by students and scholars of the history of the
English language. The York Poetry Corpus contains 71,490 words of Old
English text; the samples from the longer texts are 4,000 to 17,000 words
in length. The texts represent a range of authors and dates of
composition. The size of the corpus is approximately 2.5 megabytes.

The York Poetry Corpus was funded by ESRC grant R000222434, whose support
is gratefully acknowledged. The annotation scheme was developed by Susan
Pintzuk, Ann Taylor, Anthony Warner, Leendert Plug, and Frank Beths. The
scheme was based on the one developed at the University of Pennsylvania
for the second edition of the Penn-Helsinki Parsed Corpus of Middle
English, and it is the same as the one used for the York-Helsinki Parsed
Corpus of Old English (under construction at the University of York). Our
intent was to make the syntactic annotation of the three corpora as
similar as possible, while taking into account the syntactic and
morphological differences between Old and Middle English and between
poetry and prose.

The York Poetry Corpus is available without fee for educational and
research purposes, but it is not in the public domain. More information
about the York Poetry Corpus and how to access it is available at
http://www-users.york.ac.uk/~lang18/pcorpus.html. The manuals can be
viewed on-line without restriction, but the texts themselves are available
only to users who formally agree to the conditions of use by filling out
the access request form and returning it via e-mail to Susan Pintzuk
(sp20 at york.ac.uk). The York Poetry Corpus will soon be available through
the Oxford Text Archive (http://ota.ahds.ac.uk).


Susan Pintzuk
Department of Language and Linguistic Science
University of York
Heslington, York YO10 5DD
sp20 at york.ac.uk
Telephone: +44 1904 432661
Fax: +44 1904 432673


