[Corpora-List] Tool for raw parallel corpora alignment

Emmanuel Prochasson emmanuel.prochasson at univ-nantes.fr
Mon Mar 16 14:26:26 UTC 2009


Dear all,

I am looking for a tool that performs word-level alignment on a raw
parallel corpus. That is, given two texts that are translations of each
other, it should produce a word-level alignment (my goal is to use this
alignment to quickly build a bilingual lexicon).

I have found and tried many programs, but ran into several difficulties:
- most of them process already-aligned documents (and require another
tool to perform sentence alignment, which requires another tool...). I
need one that can process raw text documents
- a lot of them are really outdated (in computing terms) and
don't compile well with "modern" C, C++ or Java compilers

I don't really need the best alignment software ever. I need something
quite simple that can be used in a fully automatic process (that means
no windowed GUI), even if it has "low" precision compared to the best
results obtained by researchers or industry.

Do you have any suggestions?

Thanks,

-- 
Emmanuel

_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora