[Corpora-List] Google's translations

Jimmy O'Regan joregan at gmail.com
Sun Mar 14 23:37:17 UTC 2010


On 11 March 2010 13:18, Peter Kolb <pekoli at gmail.com> wrote:
> 3. Another interesting experiment is to let Google translate the German word
> "Ufer" (meaning "bank", but only in the waterside sense) into Czech. This
> gives "banky", which means "bank", but only in its financial sense. This can
> be explained by the observation that Google always uses English as
> interlingua (Ufer --> bank --> banky). If you directly translate e.g.
> Spanish to French you will get exactly the same result as when you first
> translate Spanish into English, and then translate the English output into
> French.
> Obviously, even for Google it is too costly to generate and maintain 52 * 51
> = 2652 translation models for all the supported language pairs. Or is it
> that they have found that X to English to Y always performs better than X to
> Y because there is so much more data available between English and X or Y
> than between X and Y?
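
To make the pivoting effect concrete, here is a minimal sketch. The three
lexicons are toy data invented for illustration (they are not Google's
models), and the function names are hypothetical. It shows how the waterside
sense of "Ufer" is lost once the word passes through English "bank", and how
the number of models compares between direct pairs and pivoting.

```python
# Toy lexicons, invented for illustration only (not Google's data).
DE_EN = {"Ufer": "bank"}    # German -> English: the sense collapses here
EN_CS = {"bank": "banky"}   # English -> Czech: financial sense is picked
DE_CS = {"Ufer": "břeh"}    # hypothetical direct German -> Czech entry


def pivot_translate(word, src_to_pivot, pivot_to_tgt):
    """Translate a word through a pivot language, one lookup per hop."""
    return pivot_to_tgt[src_to_pivot[word]]


def directed_pairs(n_languages):
    """Number of ordered (source, target) pairs among n languages."""
    return n_languages * (n_languages - 1)


def pivot_models(n_languages):
    """Models needed if every language only pairs with one pivot, both ways."""
    return 2 * (n_languages - 1)


if __name__ == "__main__":
    print(pivot_translate("Ufer", DE_EN, EN_CS))  # via English
    print(DE_CS["Ufer"])                          # direct toy entry
    print(directed_pairs(52), pivot_models(52))
```

With 52 languages that is 2652 direct models against 102 when everything
goes through one pivot, which is the cost argument in the quoted message.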

Improving Word Alignment with Bridge Languages, Shankar Kumar, Franz
Och, Wolfgang Macherey, Conference on Empirical Methods in Natural
Language Processing and Computational Natural Language Learning, 2007.
http://www.aclweb.org/anthology-new/D/D07/D07-1005.pdf

'We show that parallel corpora in multiple languages can be exploited to
improve the translation performance of a phrase-based translation system.
This paper gives specific recipes for using a bridge language to construct
a word alignment and for combining word alignments produced by multiple
statistical alignment models.'
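
As a rough illustration of what "using a bridge language to construct a word
alignment" can look like, the sketch below composes two alignment posterior
matrices (source to bridge, bridge to target) by summing over the bridge
words. This is only one simple way to do it, written from the abstract's
description; the function names and the numbers are made up, and the paper
should be consulted for the actual recipes.

```python
def bridge_posteriors(p_fg, p_ge):
    """Compose alignment posteriors through a bridge sentence.

    p_fg[j][k]: posterior that source word j aligns to bridge word k.
    p_ge[k][i]: posterior that bridge word k aligns to target word i.
    Returns p_fe with p_fe[j][i] = sum over k of p_fg[j][k] * p_ge[k][i].
    """
    n_bridge = len(p_ge)
    n_target = len(p_ge[0])
    return [
        [sum(row[k] * p_ge[k][i] for k in range(n_bridge))
         for i in range(n_target)]
        for row in p_fg
    ]


def best_links(p_fe):
    """For each source word, pick the target index with the highest posterior."""
    return [max(range(len(row)), key=row.__getitem__) for row in p_fe]


if __name__ == "__main__":
    # Invented posteriors for a two-word sentence in each language.
    p_fg = [[0.9, 0.1], [0.2, 0.8]]
    p_ge = [[0.1, 0.9], [0.7, 0.3]]
    p_fe = bridge_posteriors(p_fg, p_ge)
    print(p_fe)
    print(best_links(p_fe))
```

The composed matrix could then be combined with a directly estimated
source-to-target alignment, which is the second half of what the abstract
mentions.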
