[Corpora-List] Translation evaluation using word alignment

Sérgio Matos aleixomatos at ua.pt
Tue Mar 9 12:26:14 UTC 2010


Hi,
Not my domain at all, but could some form of dynamic programming on top of
the lexical alignment work?
I imagine the alignment constraints in the DP could always be relaxed to
account for different word order in the source and target languages.
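
To make the idea concrete, here is a minimal sketch of a strictly monotone DP (a longest-common-subsequence style alignment) over lexicon matches, applied to the Jon/TV example quoted below. The function name, the toy lexicon and its entries are all assumptions made for illustration; relaxing the order constraint (for instance by allowing crossing links at a penalty) is not shown.

```python
def monotone_match(source, target, lexicon):
    """Fraction of source words linked by an order-preserving (LCS-style) DP."""
    n, m = len(source), len(target)
    # best[i][j] = max number of non-crossing links between source[:i] and target[:j]
    best = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            linked = target[j - 1] in lexicon.get(source[i - 1], set())
            best[i][j] = max(
                best[i - 1][j],                            # leave source word unaligned
                best[i][j - 1],                            # leave target word unaligned
                best[i - 1][j - 1] + (1 if linked else 0), # link the two words
            )
    return best[n][m] / n if n else 0.0


# Toy English -> French lexicon (hypothetical entries, for illustration only)
LEXICON = {
    "Jon": {"Jon"},
    "appeared": {"passé"},
    "on": {"à"},
    "TV": {"TV"},
}

french = "Jon est passé à la TV".split()
print(monotone_match("Jon appeared on TV".split(), french, LEXICON))  # 1.0
print(monotone_match("TV appeared on Jon".split(), french, LEXICON))  # 0.5
```

With this toy lexicon the swapped sentence keeps only the "appeared" and "on" links, since linking "TV" or "Jon" would cross them.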

Regards,
Sérgio




-----Original Message-----
From: corpora-bounces at uib.no [mailto:corpora-bounces at uib.no] On Behalf Of
Emmanuel Prochasson
Sent: 09 March 2010 09:36
To: corpora at uib.no
Subject: Re: [Corpora-List] Translation evaluation using word alignment

On 03/09/2010 05:06 PM, Alberto Simões wrote:
> Dear Emmanuel
>
> Probably not good enough for your needs, but my experiment with NATools
> was, after obtaining a decent probabilistic translation dictionary
> (using any kind of parallel corpora you can find), to use those
> probabilities to measure the likelihood of two sentences being parallel.
>
> How did I measure it? By searching for each word of the S(ource)
> L(anguage) sentence, checking whether a translation is present in the
> T(arget) L(anguage) sentence, and taking the average of the
> probabilities. Then the same approach from TL to SL.
>
> Not fancy, but it gave some interesting results.
>    
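
The scoring described in the quoted message could look roughly like the sketch below. This is not NATools code: the function names, the toy dictionaries and their probabilities are assumptions, as are two details the message leaves open (taking the best-scoring translation found for each word, and averaging the two directions into one score).

```python
def directional_score(src_words, tgt_words, prob_dict):
    """Average, over source words, of the best translation probability
    whose translation actually occurs in the target sentence."""
    if not src_words:
        return 0.0
    present = set(tgt_words)
    total = 0.0
    for word in src_words:
        translations = prob_dict.get(word, {})
        # Words with no translation in the target contribute 0.
        total += max((p for t, p in translations.items() if t in present),
                     default=0.0)
    return total / len(src_words)


def parallel_score(sl_words, tl_words, sl_to_tl, tl_to_sl):
    """Mean of the SL->TL and TL->SL scores (the combination is an assumption)."""
    forward = directional_score(sl_words, tl_words, sl_to_tl)
    backward = directional_score(tl_words, sl_words, tl_to_sl)
    return (forward + backward) / 2


# Toy probabilistic dictionaries (made-up values, for illustration only)
EN_FR = {"Jon": {"Jon": 1.0}, "appeared": {"passé": 0.5, "apparu": 0.5},
         "on": {"à": 0.5, "sur": 0.5}, "TV": {"TV": 1.0}}
FR_EN = {"Jon": {"Jon": 1.0}, "passé": {"appeared": 0.5, "past": 0.5},
         "à": {"on": 0.25, "to": 0.75}, "TV": {"TV": 1.0}}

english = "Jon appeared on TV".split()
french = "Jon est passé à la TV".split()
print(directional_score(english, french, EN_FR))  # 0.75
print(parallel_score(english, french, EN_FR, FR_EN))
```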


I actually use a similar approach to find some good candidates (but I 
need to filter them). Instead of using a probabilistic dictionary 
computed from a parallel corpus, I use a regular lexicon.

The results are interesting, but typically it cannot tell the
difference between
"Jon appeared on TV" and
"TV appeared on Jon" (given any translation, for example the French
"Jon est passé à la TV").

Both sentences will perfectly match the French translation. I need to go
a bit deeper than the lexicon level.
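
The limitation is easy to reproduce with a small sketch of order-insensitive lexicon matching. The function name and the toy lexicon entries are assumptions for illustration, not the actual filter used here.

```python
def lexicon_coverage(source, target, lexicon):
    """Share of source words with at least one lexicon translation
    somewhere in the target sentence (word order is ignored)."""
    target_set = set(target)
    hits = sum(1 for word in source if lexicon.get(word, set()) & target_set)
    return hits / len(source) if source else 0.0


# Toy English -> French lexicon (hypothetical entries)
LEXICON = {"Jon": {"Jon"}, "appeared": {"passé", "apparu"},
           "on": {"à", "sur"}, "TV": {"TV", "télé"}}

french = "Jon est passé à la TV".split()
print(lexicon_coverage("Jon appeared on TV".split(), french, LEXICON))  # 1.0
print(lexicon_coverage("TV appeared on Jon".split(), french, LEXICON))  # 1.0
```

Both sentences get the same score because the check only looks at set membership.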

In the first case, I wish to obtain something like:
Jon/Jon est passé/appeared à la/on TV/TV => 100% match
In the second case:
Jon/NULL est passé/appeared à la/on TV/NULL => 50% match

(I'm aware that in such a case any alignment algorithm is likely to be
confused, but this is just an illustration.)

Regards,

-- 
Emmanuel


_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora

