[Corpora-List] Date: Wed, 11 Sep 2002 15:16:20 +0200

Jean Veronis Jean.Veronis at up.univ-mrs.fr
Wed Sep 11 16:31:31 UTC 2002


At 15:22 11/09/2002 +0200, maria_rzewuska at mail.ukie.gov.pl wrote:
>Hi, I have been reading the list for a while and lately I took a closer
>look at some bilingual corpus projects and I noticed a relatively flexible
>use of terms: translation corpus, parallel corpus, comparable corpus, but
>mainly between the two first. Maybe someone could tell me is there any
>difference or is it simply mixed up. In the composition of the corpora I
>did not find any difference which could explain the terminological
>difference. Any book or clever article that I should read?
>thanks

The terminology is used in different ways by different groups of people. 
The situation is so confusing that I had to include the following foreword 
in my book:

Véronis, J. (Ed.). (2000). Parallel Text Processing: Alignment and use of 
translation corpora. Dordrecht: Kluwer Academic Publishers.

http://www.up.univ-mrs.fr/veronis/parallel-book.html

------------------------

Terminological note

As the book was in its final writing stages, Alan Melby made us aware of a 
terminological difficulty concerning the expression parallel text. This 
term is well established within the computational linguistics community, as 
witnessed by its consistent use throughout this book and in the numerous 
publications listed in the bibliography, where it refers to texts 
accompanied by their translation in one or several other languages. It is 
used differently in translation theory and terminology circles, where it 
means texts in different languages and in the same domain that are not 
necessarily translations of each other (the 
computational linguistics community uses the term comparable for such texts).

We were therefore faced with a dilemma: either change the title of the 
book--and the terminology used in all the chapters--and risk a complete 
lack of understanding from the computational linguistics community, or stay 
with the usage of the term established by computational linguists and risk 
severe criticism from translation theorists and terminologists.

We decided for the latter since, after all, computational linguists are 
likely to make up the main readership of the book. Hopefully, this 
terminological note will suffice to clarify matters.


