[Corpora-List] European Constitution in parallel

Joerg Tiedemann tiedeman at let.rug.nl
Sun Apr 24 23:03:20 UTC 2005


The EU constitution is now part of the OPUS parallel corpus.
21 languages, aligned at the sentence level!


download: http://logos.uio.no/opus/EUconst.html
query: http://logos.uio.no/cgi-bin/opus/opuscqp.pl?corpus=EUconst


Everything is machine-annotated and automatically aligned, so tokenization, 
sentence splitting, and alignment are not 100% correct.

The query engine is the Corpus Workbench (CWB). There are some problems in 
cases where a conversion from UTF-8 to ISO-8859 wasn't possible. Sorry 
for that.
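
To illustrate the kind of conversion problem mentioned above, here is a minimal Python sketch that lists the characters in a string that have no representation in a given ISO-8859 variant. It is only an illustration: the post does not say which ISO-8859 part was used or how the conversion was done, so the helper name `unencodable_chars` and the default `iso-8859-1` are assumptions.

```python
from typing import List


def unencodable_chars(text: str, encoding: str = "iso-8859-1") -> List[str]:
    """Return the distinct characters of `text` that `encoding` cannot represent.

    Hypothetical helper for illustration; not part of OPUS or the Corpus Workbench.
    """
    missing: List[str] = []
    for ch in text:
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            # Keep first-seen order and skip duplicates.
            if ch not in missing:
                missing.append(ch)
    return missing


if __name__ == "__main__":
    # Compare how the same words fare under different ISO-8859 parts.
    for sample in ("Jörg", "Łódź", "Ευρώπη"):
        for enc in ("iso-8859-1", "iso-8859-2"):
            print(sample, enc, unencodable_chars(sample, enc))
```

A single ISO-8859 part covers only some of the 21 languages, which is presumably why some texts could not be converted.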

The source files are taken from:
http://europa.eu.int/eur-lex/lex/XX/treaties/dat/12004V/htm/12004V.html
(replace 'XX' with language codes such as 'en', 'de', ...)
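
The 'XX' substitution can be sketched in a few lines of Python. This is a minimal sketch that only builds the URL strings. The function name `source_url` and the two-letter validation are assumptions, and only 'en' and 'de' are codes actually named in this message.

```python
URL_TEMPLATE = (
    "http://europa.eu.int/eur-lex/lex/XX/treaties/dat/12004V/htm/12004V.html"
)


def source_url(lang_code: str) -> str:
    """Build the source URL for one language by replacing 'XX' in the template.

    Hypothetical helper; assumes codes are two ASCII letters such as 'en' or 'de'.
    """
    code = lang_code.strip().lower()
    if len(code) != 2 or not (code.isascii() and code.isalpha()):
        raise ValueError("expected a two-letter language code, got %r" % lang_code)
    # Replace only the first (and only) 'XX' placeholder.
    return URL_TEMPLATE.replace("XX", code, 1)


if __name__ == "__main__":
    # Only the two codes given as examples in the message are used here.
    for code in ("en", "de"):
        print(source_url(code))
```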



Jörg

***********/\/\/\/\/\/\/\/\/\/\/\************************************
**  Jörg Tiedemann                 tiedeman at let.rug.nl             **
**  Alfa-Informatica               http://www.let.rug.nl/~tiedeman **  
**  Rijksuniversiteit Groningen     Harmoniegebouw, room 1311-429  **
**  Oude Kijk in 't Jatstraat 26    phone: +31 (0)50-363 5935      **
**  9712 EK Groningen               fax:   +31 (0)50-363 6855      **
*************************************/\/\/\/\/\/\/\/\/\/\/\**********


