Corpora: annotated OE corpus available

Susan Pintzuk sp20 at york.ac.uk
Mon Sep 10 17:19:40 UTC 2001


Apologies for duplicate postings.


The York-Helsinki Parsed Corpus of Old English Poetry

The York-Helsinki Parsed Corpus of Old English Poetry
(henceforth the York Poetry Corpus) is now publicly
available. The York Poetry Corpus is a selection of poetic
texts from the Old English Section of the Helsinki Corpus
of English Texts (henceforth the Helsinki Corpus),
annotated to facilitate searches on lexical items and
syntactic structure. It is intended for the use of
students and scholars of the history of the English
language. The York Poetry Corpus contains 71,490 words of
Old English text; the samples from the longer texts are
4,000 to 17,000 words in length. The texts represent a
range of dates of composition and authors. The size of the
corpus is approximately 2.5 megabytes.

The York Poetry Corpus was funded by ESRC grant
R000222434, whose support is gratefully acknowledged. The
annotation scheme was developed by Susan Pintzuk, Ann
Taylor, Anthony Warner, Leendert Plug, and Frank Beths,
and implemented by Leendert Plug. The scheme was based on
the one developed at the University of Pennsylvania for
the second edition of the Penn-Helsinki Parsed Corpus of
Middle English, and it is the same as the one used for the
York-Helsinki Parsed Corpus of Old English (under
construction at the University of York). Our intent was to
make the syntactic annotation of the three corpora as
similar as possible, while taking into account the
syntactic and morphological differences between Old and
Middle English and between poetry and prose.

The York Poetry Corpus is available without fee for
educational and research purposes, but it is not in the
public domain. More information about the York Poetry
Corpus and how to access it is available at
http://www-users.york.ac.uk/~lang18/pcorpus.html. Viewing the manuals
on-line is unrestricted, but the texts themselves are
available only to users who agree formally to the
conditions of use by filling out the access request form
and returning it via e-mail to Susan Pintzuk
(sp20 at york.ac.uk). The York Poetry Corpus will soon be
available through the Oxford Text Archive
(http://ota.ahds.ac.uk).


Susan Pintzuk
Department of Language and Linguistic Science
University of York
Heslington, York YO10 5DD
sp20 at york.ac.uk
Telephone: +44 1904 432661
Fax: +44 1904 432673


