[Corpora-List] WIT^3: a new collection of parallel texts!

Marcello Federico federico at fbk.eu
Fri Jun 1 07:29:46 UTC 2012


We are very glad to announce WIT3 (read "wit cube"), an acronym for Web Inventory of Transcribed and Translated Talks, a ready-to-use version of the TED talks (http://wit3.fbk.eu/).

TED is a nonprofit organization that invites experts to give the talk of their lives. TED records all talks and posts them on its website (http://www.ted.com/). Currently, more than 1100 talks are listed there, all subtitled in English. Translations of the transcripts are also available in many languages (up to 90).

To make this collection of parallel texts more readily usable by the MT research community, we have developed WIT3, a website hosting this multilingual corpus of talks, aligned at the sentence level, together with benchmarks, processing tools and reference MT results.

We hope WIT3 will offer a useful service to the research community. For more information and to download the data, please visit http://wit3.fbk.eu/.

Marcello Federico
FBK-HLT, Trento, Italy





