[Corpora-List] Translation evaluation using word alignment
Chris Dyer
redpony at umd.edu
Tue Mar 9 13:21:38 UTC 2010
Hi Emmanuel,
The decoder that I developed to do my thesis research can run any of
its supported translation models "in alignment mode", making it
possible to get out, for example, a Viterbi or inside score for a
sentence pair under an existing model. However, be advised that the
process will fail if the specified model cannot generate the sentence
pair. But, if you pick a simple enough model (lexical translation,
say) and you handle OOV words in some reasonable way (like mapping
them all to some special token), it will probably work for any
sentence. Finally, unless you want to explore the most undocumented
features of this software, you will want to generate the model you use
using existing tools like Giza++, Joshua, or Moses.
The code and some preliminary documentation are publicly available here:
http://cdec-decoder.org
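
To make the scoring idea above concrete, here is a minimal sketch in Python.
It is not the decoder's code or interface, just an illustration of scoring a
sentence pair under a simple lexical translation model (IBM Model 1 style,
with the sentence-length term left out). The names score_pair and normalize,
the <unk> and <null> tokens, the toy table, and the probability floor are all
assumptions made for this example; in practice the table would come from a
trained model.

```python
import math

NULL = "<null>"  # empty source word that any target word may align to
UNK = "<unk>"    # special token that stands in for every OOV word


def normalize(tokens, vocab):
    """Map every token that is not in vocab to the UNK token."""
    return [w if w in vocab else UNK for w in tokens]


def score_pair(e_tokens, f_tokens, t, floor=1e-9):
    """Score p(f|e) under a lexical translation table t[(f, e)] = t(f|e).

    Returns (viterbi_logprob, inside_logprob, alignment), where
    alignment[j] is the index into [NULL] + e_tokens chosen for f_tokens[j].
    Pairs missing from the table get the probability `floor` (an assumption
    of this sketch) so that the score is always defined.
    """
    e_vocab = {e for (_, e) in t}
    f_vocab = {f for (f, _) in t}
    e_words = [NULL] + normalize(e_tokens, e_vocab)
    f_words = normalize(f_tokens, f_vocab)

    prior = 1.0 / len(e_words)  # uniform alignment probability
    viterbi = 0.0
    inside = 0.0
    alignment = []
    for f in f_words:
        probs = [t.get((f, e), floor) for e in e_words]
        best = max(range(len(probs)), key=probs.__getitem__)
        alignment.append(best)
        viterbi += math.log(prior * probs[best])  # single best alignment
        inside += math.log(prior * sum(probs))    # sum over all alignments
    return viterbi, inside, alignment


# Toy translation table, invented purely for illustration.
toy_t = {
    ("la", "the"): 0.7,
    ("la", "house"): 0.1,
    ("maison", "the"): 0.1,
    ("maison", "house"): 0.8,
}

parallel = score_pair(["the", "house"], ["la", "maison"], toy_t)
unrelated = score_pair(["the", "house"], ["zzz", "qqq"], toy_t)
```

A pair that the table explains well should get a much higher log score than
an unrelated pair. Since longer sentences always score lower, dividing by the
target length or comparing against a threshold tuned on known parallel data
would be a sensible next step.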
-Chris
On Tue, Mar 9, 2010 at 12:15 AM, Emmanuel Prochasson
<emmanuel.prochasson at univ-nantes.fr> wrote:
> Dear all,
>
> I am looking for a way to evaluate the likelihood of two sentences being
> parallel. I found many tools to perform sentence/word alignment (GIZA++,
> uplug, NATool), in order to /train/ a statistical translation model, or find
> a lexicon.
>
> My problem is somewhat different: I don't want to learn an alignment from a
> pair of parallel sentences (or documents), I want to see if two sentences
> are actually parallel by using a previously computed model and seeing if it
> fits.
>
> Training the statistical model will help to learn p(f|e), the probability of
> /f/ being the translation of /e/; an MT system will then find, for a given
> /e/, the /f/ that maximizes that probability. My problem is simpler: I
> want to be able to find p(f|e) immediately. I think one way of doing that is
> to see if I can find a good word-alignment between the two sentences.
>
> Do you know of any tool for that, or do I need to implement one, using
> Viterbi's algorithm for example?
>
> Thank you,
>
> --
> Emmanuel
>
> _______________________________________________
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora
>