Corpora: non-English corpora

jre at comp.leeds.ac.uk
Fri Jun 1 10:30:22 UTC 2001


Greetings all

I am holding out my begging bowl again!  I am trying to find non-English
PoS-TAGGED corpora, which can be as little as a few thousand words.  Ideally, I am looking for languages such as Arabic, Hindi, Russian, Basque, Spanish, Vietnamese, Latin and even Sanskrit.  Any of these or similar would be most welcome.

Ever hopeful..

John
********************************************************
John Elliott
Centre for Computer Analysis of Language and Speech
University of Leeds
email: jre at scs.leeds.ac.uk
phone: 0113 233 6827
Web-site http://www.scs.leeds.ac.uk/jre
********************************************************


