English-Spanish Training Corpora for Machine Translation

Carlos Subirats carlos.subirats at GMAIL.COM
Sun Nov 25 22:51:30 UTC 2007


-------------------          INFOLING          --------------------
Mailing list on Spanish linguistics (ISSN: 1576-3404):  http://elies.rediris.es/infoling/
Submissions: infoling-request at listserv.rediris.es
EDITORS:
Carlos Subirats Rüggeberg, UAB <carlos.subirats at uab.es>
Mar Cruz Piñol, U. Barcelona <mcruz at ub.edu>
Eulalia de Bobes Soler, U. Abat Oliba-CEU <debobes1 at uao.es>
Editorial team: http://elies.rediris.es/infoling/editores.html
Estudios de Lingüística del Español (ELiEs): http://elies.rediris.es
is a thematic network on Spanish linguistics associated with INFOLING.
---------------------------------------------------------------------

      INFOLING: an independent and pluralistic mailing list
 © Infoling, Barcelona (Spain) 1998-2007. All rights reserved

--------------------------------------------------------------------------------------------------------
English-Spanish Training Corpora for Machine Translation
Distributed and marketed by the European Language Resources Association (ELRA)
Technical information and prices:
http://catalog.elra.info/product_info.php?products_id=1033
--------------------------------------------------------------------------------------------------------

TC-STAR English-Spanish Training Corpora for Machine Translation:
Aligned Final Text Editions of European Parliament Plenary Sessions
(EPPS)

TC-STAR is a European integrated project focusing on all core
technologies for Speech-to-Speech Translation (SST): Automatic Speech
Recognition (ASR), Spoken Language Translation (SLT), and Text to
Speech Synthesis (TTS).

This corpus contains 34 million (English) and 38 million (Spanish)
running words of sentence-segmented, aligned bilingual text in
English and Spanish, obtained from the Final Text Editions provided by
the European Parliament (http://www.europarl.europa.eu). It covers
three periods: April 1996 to Sept. 2004, Dec. 2004 to May 2005, and
Dec. 2005 to May 2006. The data is accompanied by tools for further
preprocessing.

Distribution medium: CD-ROM
	
PRICES:

ELRA Member Prices
Academic:   3,000 EUR
Commercial: 4,250 EUR

Non-ELRA Member Prices
Academic:   3,925 EUR
Commercial: 5,600 EUR

----------------------------------------------------------------------

Wikipedia in Latin:
Vicipaedia Latina: http://la.wikipedia.org/wiki/Pagina_prima
Total number of articles: 15,000

----------------------------------------------------------------------


