[Corpora-List] Date: Wed, 11 Sep 2002 15:16:20 +0200
Jean Veronis
Jean.Veronis at up.univ-mrs.fr
Wed Sep 11 16:31:31 UTC 2002
At 15:22 11/09/2002 +0200, maria_rzewuska at mail.ukie.gov.pl wrote:
>Hi, I have been reading the list for a while and lately I took a closer
>look at some bilingual corpus projects and I noticed a relatively flexible
>use of terms: translation corpus, parallel corpus, comparable corpus, but
>mainly between the first two. Maybe someone could tell me whether there is any
>difference or whether the terms are simply mixed up. In the composition of the corpora I
>did not find any difference which could explain the terminological
>difference. Any book or clever article that I should read?
>thanks
The terminology is used in different ways by different groups of people.
The situation is so confusing that I had to include the following foreword
in my book:
Véronis, J. (Ed.). (2000). Parallel Text Processing: Alignment and use of
translation corpora. Dordrecht: Kluwer Academic Publishers.
http://www.up.univ-mrs.fr/veronis/parallel-book.html
------------------------
Terminological note
As the book was in its final writing stages, Alan Melby made us aware of a
terminological difficulty concerning the expression parallel text. This
term is well established within the computational linguistics community, as
witnessed by its consistent use throughout this book and in the numerous
publications listed in the bibliography, where it refers to texts
accompanied by their translation in one or several other languages. It is
used in a different way among the translation theory and terminology
circles, where it means texts in different languages and in the same
domain, but not necessarily being translations of each other (the
computational linguistics community uses the term comparable for such texts).
We were therefore faced with a dilemma: either change the title of the
book--and the terminology used in all the chapters--and risk a complete
lack of understanding from the computational linguistics community, or stay
with the usage of the term established by computational linguists and risk
severe criticism from translation theorists and terminologists.
We decided for the latter since, after all, computational linguists are
likely to make up the main readership of the book. Hopefully, this
terminological note will suffice to clarify matters.