Corpora: latin corpus

Christian Saam saam2801 at uni-trier.de
Fri Jan 26 15:36:58 UTC 2001


Dear Reader,

I'm currently looking for a corpus of Latin for my master's thesis on
inflectional morphology.

My ideal corpus looks like this:

size: 2 million running words

annotations (per word form): grammatical features (for all possible
readings), DISAMBIGUATION of readings, conjugation/declension class
(paradigm), and lemma, OR any subset of these that gives me enough
clues to arrive at a disambiguated reading of every word form
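
To make the wish list above concrete, one annotated word form could be
represented roughly as in the Python sketch below. This is only an
illustration: the names Reading, Token and chosen, and the feature
labels, are assumptions made up for the example and not the format of
any existing corpus.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Reading:
    """One possible analysis of a word form (hypothetical layout)."""
    lemma: str
    paradigm: str              # conjugation/declension class
    features: Dict[str, str]   # grammatical features of this reading


@dataclass
class Token:
    """A running word with all its readings and the disambiguated one."""
    form: str
    readings: List[Reading]
    chosen: Optional[int] = None  # index into readings; None = unresolved

    def resolved(self) -> Optional[Reading]:
        """Return the disambiguated reading, or None if still ambiguous."""
        if self.chosen is None:
            return None
        return self.readings[self.chosen]


def rosa(case: str, number: str) -> Reading:
    """Helper for the example: a reading of the noun 'rosa'."""
    return Reading(
        lemma="rosa",
        paradigm="1st declension",
        features={"case": case, "number": number, "gender": "fem"},
    )


# "rosae" is ambiguous between four readings; here the annotation
# marks the genitive singular as the one that applies in context.
token = Token(
    form="rosae",
    readings=[
        rosa("gen", "sg"),
        rosa("dat", "sg"),
        rosa("nom", "pl"),
        rosa("voc", "pl"),
    ],
    chosen=0,
)
```

A corpus that records only the chosen reading, or only enough
information to derive it, would serve my purposes just as well.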

What I've come across so far:

The Corpus Augustinianum Gissense is only lemmatized, and due to its
restricted interface not even that can be exploited for my purposes.

The texts in the Perseus collection at Tufts University seem worth
looking at. But although an analysis can be looked up for every word,
ambiguities are never resolved. (And I don't expect to come up with a
quick enough solution to the disambiguation problem, given the
(relatively) free word order.)

The WordCruncher collection at the TITUS site in Frankfurt doesn't seem
to contain any Latin texts.

The Thesaurus Linguae Latinae server at the University of Saskatchewan
couldn't be reached via any of the links I found.


