14.1678, Sum: Parallel Texts for MT Evaluation
LINGUIST List
linguist at linguistlist.org
Fri Jun 13 13:07:46 UTC 2003
LINGUIST List: Vol-14-1678. Fri Jun 13 2003. ISSN: 1068-4875.
Subject: 14.1678, Sum: Parallel Texts for MT Evaluation
Moderators: Anthony Aristar, Wayne State U. <aristar at linguistlist.org>
Helen Dry, Eastern Michigan U. <hdry at linguistlist.org>
Reviews (reviews at linguistlist.org):
Simin Karimi, U. of Arizona
Terence Langendoen, U. of Arizona
Home Page: http://linguistlist.org/
The LINGUIST List is funded by Eastern Michigan University, Wayne
State University, and donations from subscribers and publishers.
Editor for this issue: Karen Milligan <karen at linguistlist.org>
==========================================================================
To post to LINGUIST, use our convenient web form at
http://linguistlist.org/LL/posttolinguist.html.
=================================Directory=================================
1)
Date: Fri, 13 Jun 2003 13:00:20 +0100 (BST)
From: D Elliott <debe at comp.leeds.ac.uk>
Subject: Parallel texts for MT evaluation
-------------------------------- Message 1 -------------------------------
Date: Fri, 13 Jun 2003 13:00:20 +0100 (BST)
From: D Elliott <debe at comp.leeds.ac.uk>
Subject: Parallel texts for MT evaluation
Dear all,
Thanks to everyone who responded to my request for parallel texts with
good quality human translations, suitable for my MT evaluation research
(Linguist 14.1461).
Here is a summary of resources available from the web:
INTERSECT corpus
FRENCH-ENGLISH:
Le Monde, instructions for domestic appliances, technical and academic
texts and others
GERMAN-ENGLISH:
Company home pages, news items, EU documents and more
http://www.brighton.ac.uk/edusport/languages/html/intersect.html
Thanks to Professor Raphael Salkie, University of Brighton, UK
Proceedings of the European Parliament
MANY EUROPEAN LANGUAGES INTO ENGLISH
http://www.isi.edu/~koehn/publications/europarl/
Thanks to Susana Sotelo Docío, Universidade de Santiago de Compostela
OPUS corpus
ENGLISH SOURCE TEXTS translated into French, Spanish, Swedish, German,
and Japanese.
Jörg Tiedemann and Lars Nygaard compiled the documentation of the
office package OpenOffice[1] and the PHP[2] manual. The resulting
corpus is OPUS - an open source parallel corpus.
http://logos.uio.no/opus/
[1] http://www.openoffice.org
[2] http://www.php.net
Thanks to Susana Sotelo Docío, Universidade de Santiago de Compostela
UN Universal Declaration of Human Rights
Many languages
http://www.unhchr.ch/udhr/index.htm
Thanks to Paul McNamee, Johns Hopkins University and Ella Earp-Lynch,
SpeechWorks International
Centers for Disease Control and Prevention (USA)
Information in Chinese, French, Japanese and Spanish on SARS and many
other medical topics
http://www.cdc.gov/
http://www.cdc.gov/ncidod/sars/languages.htm
Thanks to Paul McNamee, Johns Hopkins University
Debian free software community:
Technical translations
http://www.debian.org/international/
Thanks to Paul McNamee, Johns Hopkins University
Official journal of the EU
Freely downloadable European legislation in many languages
http://europa.eu.int
Thanks to Paul McNamee, Johns Hopkins University, Terence Lewis
(Language Engineer) and Koen Kerremans
Public registry of the Council of the EU
PDF files in various languages. Translations indicate the source
language.
http://register.consilium.eu.int/
Thanks to John Beaven
COMPARA corpus
English-Portuguese/Portuguese-English
http://www.linguateca.pt/COMPARA/
Thanks to Dr Ana Frankenberg-Garcia, Instituto Superior de Línguas e
Administração, Lisboa, Portugal
The Universal Declaration of Human Rights
UNESCO's website also has most documents available in Spanish and
French translation, and frequently in Russian, Chinese and Arabic
French Foreign Ministry's magazine - Label France:
French into various languages
http://www.france.diplomatie.fr/label_france/index.html
Thanks to Jeremy Whistle, University College Northampton
ELRA newsletter
In French and English
www.elda.fr
Thanks to Jeff Allen
Multilingual articles:
English version:
http://www.multilingual.com/allen51.htm
French translation:
http://www.editionscle.com/bol/presse/article1/allen-mltc51-fr.htm
English version:
http://www.multilingual.com/allen53.htm
French translation:
http://www.editionscle.com/bol/presse/article2/allen-mltc53-fr.htm
Thanks to Jeff Allen
Haitian Creole version:
http://hometown.aol.com/mit2haiti/JA-HC-kr.htm
English version:
http://hometown.aol.com/mit2haiti/JA-HC-eng.htm
Thanks to Jeff Allen
MIT2 website
Marilyn Mason Bio & Publication List:
http://hometown.aol.com/marilinc/Index3.html
Creole Links Page:
http://hometown.aol.com/mit2haiti/Index4.html
The Creole Clearinghouse:
http://hometown.aol.com/CreoleCH/Index6.html
Thanks to Jeff Allen
***************************************************
Debbie Elliott
Computer Vision and Language Research Group,
School of Computing,
University of Leeds,
Leeds LS2 9JT
United Kingdom.
Website (to be expanded):
http://www.comp.leeds.ac.uk/cgi-bin/sis/ext/rs_pub.cgi/debe.html?cmd=displayrs
---------------------------------------------------------------------------
LINGUIST List: Vol-14-1678