Corpora: Parallel corpus

Mike Maxwell Mike_Maxwell at sil.org
Mon Dec 18 23:12:32 UTC 2000


Yuliya M. Katsnelson writes:

>I am looking for a parallel corpus (news, etc.) in English
>and optimally, Eastern European languages.

For nearly every written language, there is at least one parallel corpus:
the Bible (or at least the New Testament).  Such a source has obvious
shortcomings: the alignment is at the verse level, which may be too coarse
for some purposes; much of the vocabulary is likely to be in semantic
domains not of wider interest; there are issues of translation style; and
the corpus may be too small.  But it's there, and in many cases should be
available in electronic form, perhaps even on the web.
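
As a rough illustration of what verse-level alignment amounts to, here is a
minimal Python sketch.  It assumes each translation is stored as plain text
with one verse per line in the form "REF<TAB>TEXT" (e.g. "JHN 1:1"); that
file layout, the function names, and the placeholder verse texts are all
hypothetical, not a description of any particular electronic edition.

```python
def parse_verses(lines):
    """Map verse reference -> verse text.

    Assumes the hypothetical layout 'REF<TAB>TEXT', one verse per line.
    Blank lines and lines without a tab are skipped.
    """
    verses = {}
    for line in lines:
        ref, sep, text = line.rstrip("\n").partition("\t")
        if not sep or not ref.strip():
            continue
        verses[ref.strip()] = text.strip()
    return verses


def align_verses(source, target):
    """Pair up verses that share a reference, in source order.

    Verses missing from either side are dropped, since translations
    do not always contain exactly the same set of verses.
    """
    return [(ref, source[ref], target[ref]) for ref in source if ref in target]


if __name__ == "__main__":
    # Placeholder texts stand in for real verses in two languages.
    english = ["JHN 1:1\ten verse one", "JHN 1:2\ten verse two"]
    other = ["JHN 1:1\txx verse one"]
    for ref, en, xx in align_verses(parse_verses(english), parse_verses(other)):
        print(ref, "|", en, "|", xx)
```

The verse reference serves as the alignment key, so nothing finer than a
verse is aligned; sentence- or word-level alignment within a verse would
need a separate step.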

     Mike Maxwell
     Mike_Maxwell at sil.org


