[Corpora-List] BILINGUAL PARALLEL CORPORA

Olivier Kraif olivier.kraif at tele2.fr
Tue Nov 14 20:53:11 UTC 2006


Dear J.L.,

Another parallel corpus covering Latin languages is available on the 
website of the Carmel Project (www.projetcarmel.org): it consists of travel 
stories from the 19th and early 20th centuries (Darwin, Loti, Stendhal, 
Flaubert, Dickens, London, etc.), translated into English, French, Italian 
and Spanish.
The corpus can be queried online, but it is also possible to download 
some of the texts (all the original texts, and those translations that are 
old enough!), along with the alignment files. The texts are POS-tagged.
The website is still under construction and some data are not yet 
available; I think it should be complete within a month.
You can find a list of links on my page:
http://w3.u-grenoble3.fr/kraif/index.php?option=com_content&task=view&id=20&Itemid=36

If you need a tool to process a bilingual corpus (aligning at sentence 
or word level, editing, searching, etc.), I have also put a free software 
tool online (Windows only):
Alinea:
http://w3.u-grenoble3.fr/kraif/index.php?option=com_content&task=view&id=27&Itemid=43 

(the latest version is not yet available, but previous ones can be 
downloaded).

Regards

Olivier






> Dear Corpora-List members,
>  
> I have three questions...
>  
> Does anyone know of any freely available, sentence-aligned bilingual 
> corpus involving several languages, namely Scandinavian (Finnish, 
> Norwegian, etc.) or Latin languages (Spanish, Italian, etc.), for 
> bilingual studies?
>  
> My second question is: what would be the requirements for creating an 
> online/desktop software tool covering the whole processing of a 
> parallel corpus?
>  
> Finally, would you consider one million words (in both languages) a 
> large or a small bilingual corpus?
>  
> Any help will be appreciated.
>  
>  
> Regards,
>  
>  
> J. L. DeLucca (somewhere in Spain)
>  
>


