[Corpora-List] Parallel corpora that are directly searchable on the web?

Joerg Tiedemann jorg.tiedemann at lingfil.uu.se
Mon Nov 8 17:39:08 UTC 2010


OPUS:
http://www.let.rug.nl/tiedeman/OPUS/
http://www.let.rug.nl/tiedeman/OPUS/bin/opuscqp.pl

LinearB
http://linearb.co.uk/

MyMemory
http://mymemory.translated.net/




-- 
**********************************************************************************
 Jörg Tiedemann                                     jorg.tiedemann at lingfil.uu.se
 Dep. of Linguistics and Philology                  http://stp.lingfil.uu.se/~joerg/
 Uppsala University                                 tel: +46 (0)18 - 471 1412
 Box 635, SE-751 26 Uppsala/SWEDEN                  fax: +46 (0)18 - 471 1094


e Roncal <i.sanvicente at elhuyar.com> wrote:
> http://corpus.consumer.es/corpus/ - Eroski Consumer magazine corpus
> includes parallel texts in Spanish, Basque, Galician and Catalan
> languages, approximately 2.5M words per language (3.7M in the case of
> Spanish).
>
>
> Regards,
> Iñaki.
>
>
> .......................................................
>
>
>
> Iñaki San Vicente
> Hizkuntza Zerbitzuak - I+G+B
> Elhuyar Fundazioa
> Zelai Haundi, 3
> Osinalde industrialdea
> 20170 Usurbil
> tel.: 943363040
> www.elhuyar.org
>
>
>
>
>
>
> 2010/11/8 Wallace Chen <juiching.chen at gmail.com>
>>
>> Any links for the English-Chinese pair? Thanks!
>>
>> Wallace Chen
>> Monterey Institute of International Studies
>> U.S.A.
>>
>> On 11/8/2010 6:38 AM, R.M.Salkie at bton.ac.uk wrote:
>>
>> I have found these web sites where you can search for a word or phrase in one language and get a bilingual or multilingual concordance as output.
>>
>>
>>
>> Can anyone add to the list?
>>
>>
>>
>> Thanks. – Raphael Salkie, School of Humanities, University of Brighton, UK.
>>
>>
>>
>>
>>
>> 1. CABAL: Un concordancier en ligne pour la linguistique contrastive
>>
>> http://cabal.rezo.net/ (University of Poitiers)
>>
>> English, French
>>
>> About 200 articles are currently online (roughly 400,000 words). Most come from Le Monde diplomatique and date from 1998 to December 2003.
>>
>>
>>
>> 2.  The CLUVI corpus:
>>
>> http://sli.uvigo.es/CLUVI/index_en.html
>>
>> English, French, Spanish, Galician
>>
>> Corpus:  UNESCO Corpus of English-Galician-French-Spanish scientific-technical divulgation
>>
>>
>>
>> 3. German(-English) parallel corpora (Europarl and German News)
>>
>> http://corpus.leeds.ac.uk/paraquery.html
>>
>> English, German
>>
>>
>>
>> 4. WebTCE (Translation Corpus Explorer)
>>
>> http://khnt.hit.uib.no/webtce.htm
>>
>> English, German, French, Spanish, Norwegian, Danish
>>
>>
>>
>> 5. EVROKORPUS Parallel corpora
>>
>> http://evrokorpus.gov.si/index.php?jezik=angl
>>
>> 223 million words. English, French, German, Italian, Slovene and Spanish. Searches must involve Slovene and one other language.
>>
>>
>>
>> 6. TERMACOR terminology and corpus
>>
>> http://evrokorpus.gov.si/k2/index.php?jezik=angl
>>
>> 98 million words in 22 European Languages.  EU Commission data.
>>
>>
>>
>> 7. COMPARA Portuguese-English parallel corpus
>>
>> http://www.linguateca.pt/COMPARA/
>>
>> Three million words.
>>
>> Portuguese, English
>>
>>
>>
>> 8. Termsearch
>>
>> http://www.termsearch.info/ or a faster interface at:
>>
>> http://www.bible-study-in-geneva.info/termsearch/
>>
>> English, French, Russian
>>
>> Major international treaties, conventions, agreements, etc. 792 documents.
>>
>>
>>
>> 9. English-Inuktitut Parallel Corpus
>>
>> http://www.inuktitutcomputing.ca/NunavutHansard/en/
>>
>> 3.5 million words (of English), 1.5 million words of Inuktitut
>>
>> English, Inuktitut (an Inuit Language of North-Eastern Canada)
>>
>>
>>
>> 10. English-Russian Parallel Corpus
>>
>> http://ruscorpora.ru/search-para.html
>>
>> English, Russian, (some German?)
>>
>> Interface only in Russian.
>>
>> About 9 million words
>>
>>
>>
>> _______________________________________________
>> Corpora mailing list
>> Corpora at uib.no
>> http://mailman.uib.no/listinfo/corpora
>>
>



-- 
**********************************************************************************
Jörg Tiedemann                                 http://stp.lingfil.uu.se/~joerg/



