[Corpora-List] Translation evaluation using word alignment

Alberto Simões albie at alfarrabio.di.uminho.pt
Tue Mar 9 09:06:03 UTC 2010


Dear Emmanuel

Probably not good enough for your needs, but my experiment with NATools
was this: after obtaining a decent probabilistic translation dictionary
(using any kind of parallel corpora you can find), use those probabilities
to measure the likeliness of two sentences being parallel.

How did I measure it? By looking up each word of the S(ource)
L(anguage) sentence, checking whether one of its translations is present
in the T(arget) L(anguage) sentence, and taking the average of the
probabilities. Then the same approach from TL to SL.
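
A minimal sketch of that scoring in Python, under several assumptions that
are not stated above: the dictionary is a plain nested dict (word ->
{translation: probability}), which is not NATools' own format; when several
translations of a word occur in the other sentence, the best probability is
taken; a word with no translation present counts as 0; and the two
directions are combined by a simple mean. All names and the toy
probabilities are hypothetical.

```python
def directional_score(src_tokens, tgt_tokens, ptd):
    """Average, over the source words, of the probability of a translation
    that actually occurs in the target sentence.

    ptd maps a source word to {translation: probability} (assumed format).
    """
    if not src_tokens:
        return 0.0
    tgt_set = set(tgt_tokens)
    total = 0.0
    for word in src_tokens:
        candidates = ptd.get(word, {})
        # Assumption: keep the best-scoring translation found in the target;
        # a word with no translation present contributes 0.
        total += max(
            (p for t, p in candidates.items() if t in tgt_set),
            default=0.0,
        )
    return total / len(src_tokens)


def parallel_score(src_tokens, tgt_tokens, ptd_src_tgt, ptd_tgt_src):
    """Score SL -> TL and TL -> SL, then average the two directions
    (the averaging of directions is an assumption)."""
    forward = directional_score(src_tokens, tgt_tokens, ptd_src_tgt)
    backward = directional_score(tgt_tokens, src_tokens, ptd_tgt_src)
    return (forward + backward) / 2.0


# Toy dictionaries with made-up probabilities, for illustration only.
ptd_en_pt = {
    "the": {"o": 0.5, "a": 0.4},
    "cat": {"gato": 0.9},
    "sleeps": {"dorme": 0.8},
}
ptd_pt_en = {
    "o": {"the": 0.9},
    "gato": {"cat": 0.9},
    "dorme": {"sleeps": 0.7},
}

english = ["the", "cat", "sleeps"]
good_pt = ["o", "gato", "dorme"]
bad_pt = ["a", "casa", "azul"]

good_score = parallel_score(english, good_pt, ptd_en_pt, ptd_pt_en)
bad_score = parallel_score(english, bad_pt, ptd_en_pt, ptd_pt_en)
print(good_score, bad_score)
```

With these toy values the parallel pair scores well above the unrelated
pair, so a threshold on the score could separate them; where to put that
threshold would have to be tuned on real data.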

Not fancy, but gave some interesting results.

Cheers

On 09/03/2010 05:15, Emmanuel Prochasson wrote:
> Dear all,
> 
> I am looking for a way to evaluate the likeliness of two sentences being
> parallel. I found many tools to perform sentence/word alignment (GIZA++,
> uplug, NATool), in order to /train/ a statistical translation model, or
> find a lexicon.
> 
> My problem is somehow different : I don't want to learn an alignment
> from a pair of parallel sentences (or documents), I want to see if two
> sentences are actually parallel by using a previously computed model and
> see if it can fit.
> 
> Training the statistical model will help to learn p(f|e), probability of
> /f/ being the translation of /e/, then a MT system will find, for a
> given /e/, the /f/ that maximizes the previous probability. My problem is
> simpler: I want to be able to find p(f|e) immediately. I think one way
> of doing that is to see if I can find a good word-alignment between the
> two sentences.
> 
> Do you know of any tool for that, or do I need to implement one, using
> Viterbi's algorithm for example ?
> 
> Thank you,
> 

-- 
Alberto Simões

_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora


