[Corpora-List] European Constitution in parallel
Joerg Tiedemann
tiedeman at let.rug.nl
Sun Apr 24 23:03:20 UTC 2005
The EU constitution is now part of the OPUS parallel corpus.
21 languages, aligned at the sentence level!
download: http://logos.uio.no/opus/EUconst.html
query: http://logos.uio.no/cgi-bin/opus/opuscqp.pl?corpus=EUconst
Everything is machine-annotated and automatically aligned. Tokenization,
sentence splitting, and alignment are not 100% correct ...
The query engine is the Corpus Workbench (CWB). There are some problems
in cases where a conversion from UTF-8 to ISO-8859 wasn't possible. Sorry
for that.
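
To illustrate the kind of conversion problem mentioned above, here is a
minimal sketch that checks whether a string survives a conversion to a
single-byte charset. It is not the OPUS conversion code. The message only
says "ISO-8859", so the default target ISO-8859-1 is an assumption, and
the function names are made up for this example.

```python
def fits_charset(text: str, charset: str = "iso-8859-1") -> bool:
    """Return True if every character of text can be encoded in charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


def unencodable_chars(text: str, charset: str = "iso-8859-1") -> list[str]:
    """List the distinct characters of text that charset cannot represent."""
    return sorted({ch for ch in text if not fits_charset(ch, charset)})


# German umlauts are covered by ISO-8859-1 (assumed target charset) ...
german_ok = fits_charset("Verfassung für Europa")
# ... but the Polish letter 'ą' is not; it needs e.g. ISO-8859-2.
polish_ok = fits_charset("ustanawiający")
polish_bad = unencodable_chars("ustanawiający")
```

With 21 languages in the corpus, no single ISO-8859 part covers all of the
texts, which is presumably why some conversions fail.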
The source files are taken from:
http://europa.eu.int/eur-lex/lex/XX/treaties/dat/12004V/htm/12004V.html
(replace 'XX' with language codes such as 'en', 'de', ...)
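
The substitution described above can be sketched as follows. The URL
template is copied from this message; the function name is hypothetical,
and only the codes 'en' and 'de' are named here, so any other code would
be your own assumption.

```python
# Template copied from the message; 'XX' is the language-code placeholder.
TEMPLATE = "http://europa.eu.int/eur-lex/lex/XX/treaties/dat/12004V/htm/12004V.html"


def source_url(lang_code: str) -> str:
    """Build the source URL for one language by replacing the 'XX' path segment."""
    return TEMPLATE.replace("/XX/", f"/{lang_code}/")


# Only the two codes mentioned in the message are used here.
urls = {code: source_url(code) for code in ("en", "de")}
```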
Jörg
***********/\/\/\/\/\/\/\/\/\/\/\************************************
** Jörg Tiedemann tiedeman at let.rug.nl **
** Alfa-Informatica http://www.let.rug.nl/~tiedeman **
** Rijksuniversiteit Groningen Harmoniegebouw, room 1311-429 **
** Oude Kijk in 't Jatstraat 26 phone: +31 (0)50-363 5935 **
** 9712 EK Groningen fax: +31 (0)50-363 6855 **
*************************************/\/\/\/\/\/\/\/\/\/\/\**********