[Corpora-List] Tool for raw parallel corpora alignment

Emmanuel Prochasson emmanuel.prochasson at univ-nantes.fr
Tue Mar 17 09:09:03 UTC 2009


Thank you all for your quick and accurate answers. I'll have a look at 
all the tools you suggested.

Emmanuel Prochasson wrote:
> Dear all,
>
> I am looking for a tool to perform word-level alignment on a raw 
> parallel corpus. That is, given two texts that are translations of each 
> other, produce a word-level alignment (my goal is to use this word 
> alignment to quickly build a bilingual lexicon).
>
> I found and tried many software packages, but ran into several difficulties:
> - most of them process already aligned documents (and require another 
> tool to perform sentence alignment, which requires another tool...). I 
> need one that can process raw text documents
> - a lot of them are really outdated (in computing terms) and 
> don't compile well with "modern" C, C++ or Java compilers
>
> I don't really need the best alignment software ever. I need something 
> quite simple that can be used in a fully automatic process (that means 
> no windows GUI), even if it has "low" precision compared to the best 
> results obtained by researchers or industry.
>
> Do you have any suggestions?
>
> Thanks,
>
>   
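
For illustration only, the word-alignment-to-lexicon step described in the 
quoted message can be sketched with IBM Model 1 trained by EM. This is a toy 
sketch, not one of the tools suggested on the list. It assumes the texts have 
already been split into aligned, tokenized sentence pairs (the very step the 
message notes raw text still needs), it omits the usual NULL word, and the 
function names train_ibm1 and best_translations are hypothetical.

```python
from collections import defaultdict


def train_ibm1(pairs, iterations=10):
    """Estimate t[(tgt, src)] = P(tgt word | src word) with EM (IBM Model 1).

    pairs: list of (src_tokens, tgt_tokens), assumed sentence-aligned.
    Simplification: no NULL source word is added.
    """
    tgt_vocab = {w for _, tgt in pairs for w in tgt}
    # Uniform initialisation over the target vocabulary.
    t = defaultdict(lambda: 1.0 / len(tgt_vocab))
    for _ in range(iterations):
        count = defaultdict(float)  # expected count of (tgt, src) links
        total = defaultdict(float)  # expected count of src words
        for src, tgt in pairs:
            for e in tgt:
                # E-step: share each target word among the source words.
                norm = sum(t[(e, f)] for f in src)
                for f in src:
                    frac = t[(e, f)] / norm
                    count[(e, f)] += frac
                    total[f] += frac
        # M-step: renormalise the expected counts per source word.
        t = defaultdict(
            float, {(e, f): c / total[f] for (e, f), c in count.items()}
        )
    return t


def best_translations(t):
    """Keep, for each source word, the target word with the highest P(tgt | src)."""
    best = {}
    for (e, f), p in t.items():
        if f not in best or p > best[f][1]:
            best[f] = (e, p)
    return {f: e for f, (e, _) in best.items()}


if __name__ == "__main__":
    toy = [
        ("das Haus".split(), "the house".split()),
        ("das Buch".split(), "the book".split()),
        ("ein Buch".split(), "a book".split()),
    ]
    print(best_translations(train_ibm1(toy)))
```

On real data the resulting lexicon would be noisy, which matches the "low" 
precision the message says is acceptable; filtering entries by probability 
or frequency would be a natural next step.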




