[Corpora-List] Google's translations

Taras Zagibalov T.Zagibalov at sussex.ac.uk
Fri Mar 12 01:14:50 UTC 2010


I checked the English-Russian translations of "river bank" and "bank
staff", and in both cases the translation was correct. The
English-Chinese translations were also fine, and direct Russian-Chinese
translation also produced correct results.
So point 3 is probably not applicable to all language pairs.
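
For reference, the pivot effect described in point 3 of the quoted
message can be sketched as follows. This is only a toy illustration:
the dictionaries and function names are made up for the example (they
are not Google's data or API), and the Czech entry "břeh" for the
waterside sense is an addition that does not come from the thread.

```python
# Toy one-best lexicons (hypothetical, for illustration only).
DE_TO_EN = {"Ufer": "bank"}     # waterside sense collapses onto ambiguous "bank"
EN_TO_CS = {"bank": "banky"}    # financial sense, as reported in the thread
DE_TO_CS = {"Ufer": "břeh"}     # assumed direct entry (waterside sense)


def pivot_translate(word, src_to_pivot, pivot_to_tgt):
    """Translate via a pivot language; returns (pivot_word, target_word)."""
    pivot_word = src_to_pivot[word]
    return pivot_word, pivot_to_tgt[pivot_word]


def direct_models(n_languages):
    """Number of directed language pairs if every pair has its own model."""
    return n_languages * (n_languages - 1)


def pivot_models(n_languages):
    """Models needed if everything goes through a single pivot language:
    one into and one out of the pivot for each non-pivot language."""
    return 2 * (n_languages - 1)


if __name__ == "__main__":
    print(pivot_translate("Ufer", DE_TO_EN, EN_TO_CS))
    print(DE_TO_CS["Ufer"])
    print(direct_models(52), pivot_models(52))
```

With one-best word lookup the sense is lost at the pivot step. When the
source gives more context (as in "river bank"), a real system may still
pick the right sense, which would be consistent with the results above.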

Best regards,
Taras

2010/3/11 Peter Kolb <pekoli at gmail.com>:
> I have three comments:
>
> 1. The text by Kant contains a lot of anaphoric pronouns. From Google's
> translation it is obvious that their system does not perform any pronoun
> resolution (or at least none that works better than a random baseline).
> However, there exist German to English translation engines on the market
> that incorporate such components.
>
> 2. Consider the following extract from Kant's text:
>
> "wo [jedermann]SUBJ, [der sonst in allen übrigen Dingen unwissend ist]REL_CL,
> [sich]REFLX [ein entscheidendes Urteil]OBJ [anmaßt]PRED"
>
> A simple relative clause separates subject from object and predicate. The
> completely garbled translation that Google delivers can serve as a textbook
> example to illustrate how n-gram models (even 9-grams in this case) of
> syntax fail to cope with long range dependencies.
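
As a rough check of the distances involved in the clause quoted above,
the following sketch computes the smallest n-gram window that would
contain two given words at once. It assumes plain whitespace
tokenization with the commas dropped; the function name is made up for
the example, and a real system's tokenizer would give somewhat
different counts.

```python
def min_ngram_order(tokens, first, second):
    """Smallest n such that a single n-gram window contains both tokens
    (uses the first occurrence of each token)."""
    i = tokens.index(first)
    j = tokens.index(second)
    return abs(j - i) + 1


# The clause from Kant's text, punctuation removed (an assumption).
clause = (
    "wo jedermann der sonst in allen übrigen Dingen unwissend ist "
    "sich ein entscheidendes Urteil anmaßt"
).split()

if __name__ == "__main__":
    # Subject to reflexive pronoun, and subject to predicate.
    print(min_ngram_order(clause, "jedermann", "sich"))
    print(min_ngram_order(clause, "jedermann", "anmaßt"))
```

Under these assumptions a window would need 10 tokens to hold the
subject together with the reflexive, and 14 to hold it together with
the predicate, so a 9-gram never sees subject and predicate at once.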
>
> 3. Another interesting experiment is to let Google translate the German word
> "Ufer" (meaning "bank", but only in the waterside sense) into Czech. This
> gives "banky", which means "bank", but only in its financial sense. This can
> be explained by the observation that Google always uses English as
> an interlingua (Ufer --> bank --> banky). If you directly translate e.g.
> Spanish to French you will get exactly the same result as when you first
> translate Spanish into English, and then translate the English output into
> French.
> Obviously, even for Google it is too costly to generate and maintain 52 * 51
> = 2652 translation models for all the supported language pairs. Or is it
> that they have found that X to English to Y always performs better than X to
> Y because there is so much more data available between English and X or Y
> than between X and Y?
>
> Peter Kolb
>
> ------------------------------------
> Department Linguistik, University of Potsdam
> Karl-Liebknecht-Str. 24-25, D-14476 Golm
> Phone: +49-331-977-2930
> Fax: +49-331-977-2761
> E-Mail: pekoli at gmail.com
> http://www.ling.uni-potsdam.de/~kolb
>
> http://www.linguatools.de
>
> 2010/3/10 John F. Sowa <sowa at bestweb.net>
>>
>> Following is an article from the New York Times about Google's
>> translation service:
>>
>>
>> http://www.nytimes.com/2010/03/09/technology/09translate.html?hpw&pagewanted=all
>>
>> And following is an excerpt:
>>
>>> “What you see on Google Translate is state of the art” in computer
>>> translations that are not limited to a particular subject area,
>>> said Alon Lavie, an associate research professor in the Language
>>> Technologies Institute at Carnegie Mellon University.
>>
>> Following is the Google web page for entering text or the URL of
>> a document to be translated:
>>
>>   http://translate.google.com
>>
>> So I entered one paragraph by Wittgenstein and one by Kant.
>> See below for the results.
>>
>> I discovered that the translations were sensitive to line breaks.
>> For each paragraph, there are two translations:  the first of
>> a "cut and paste" from text files with line breaks; the second
>> of the same paragraphs as displayed by Firefox from html files.
>> The html version eliminated the line breaks in the excerpts
>> copied to Google.
>>
>> Does anyone have any comments or observations about the state
>> of the art?
>>
>> John
>> _________________________________________________________________________
>>
>> From the Preface to Wittgenstein's Tractatus Logico-Philosophicus:
>>
>> Dagegen scheint mir die Wahrheit der hier mitgeteilten Gedanken
>> unantastbar und definitiv.  Ich bin also der Meinung, die Probleme im
>> Wesentlichen endgültig gelöst zu haben.  Und wenn ich mich hierin nicht
>> irre, so besteht nun der Wert dieser Arbeit zweitens darin, daß sie
>> zeigt, wie wenig damit getan ist, daß die Probleme gelöst sind.
>>
>> First translation from a text file with line breaks:
>>
>> On the other hand seems to me the truth of the thoughts communicated here
>> unassailable and definitive. I am therefore of the opinion that the
>> problems in
>> Have solved essentially. And if I'm not in this
>> mistaken, then, is the value of this work, secondly the fact that they
>> shows how little has been done that the problems are solved.
>>
>> Second translation of the same text entered from an html file:
>>
>> On the other hand seems to me the truth of the thoughts communicated here
>> unassailable and definitive. I am therefore of the opinion that the problems
>> largely been finally solved. And if I am not mistaken, so now is the value
>> of this work, secondly the fact that it shows how little has been done that
>> the problems are solved.
>>
>> From the preface to Kant's Prolegomena to any Future Metaphysics:
>>
>> Ist sie Wissenschaft, wie kommt es, daß sie sich nicht, wie andre
>> Wissenschaften, in allgemeinen und daurenden Beifall setzen kann?
>> Ist sie keine, wie geht es zu, daß sie doch unter dem Scheine einer
>> Wissenschaft unaufhörlich groß tut, und den menschlichen Verstand mit
>> niemals erlöschenden, aber nie erfüllten Hoffnungen hinhält? Man mag
>> also entweder sein Wissen oder Nichtwissen demonstrieren, so muß doch
>> einmal über die Natur dieser angemaßten Wissenschaft etwas Sicheres
>> ausgemacht werden; denn auf demselben Fuße kann es mit ihr unmöglich
>> länger bleiben. Es scheint beinahe belachenswert, indessen daß jede
>> andre Wissenschaft unaufhörlich fortrückt, sich in dieser, die doch
>> die Weisheit selbst sein will, deren Orakel jeder Mensch befrägt,
>> beständig auf derselben Stelle herumzudrehen, ohne einen Schritt
>> weiterzukommen. Auch haben sich ihre Anhänger gar sehr verloren, und
>> man siehet nicht, daß diejenigen, die sich stark genug fühlen, in
>> andern Wissenschaften zu glänzen, ihren Ruhm in dieser wagen wollen,
>> wo jedermann, der sonst in allen übrigen Dingen unwissend ist, sich
>> ein entscheidendes Urteil anmaßt, weil in diesem Lande in der Tat
>> noch kein sicheres Maß und Gewicht vorhanden ist, um Gründlichkeit
>> von seichtem Geschwätze zu unterscheiden.
>>
>> First translation from a text file with line breaks:
>>
>> If it is science, how is it that they did not, like other
>> Share sciences in general and may daurenden applause?
>> If it is not, how is it that they do under the guise of a
>> Science constantly doing great, and the human mind with
>> Never dying, but never fulfilled hopes of holding out? One may
>> show, so either his knowledge or ignorance, it must nevertheless
>> Once on the nature of science usurped something secure
>> be identified, because it can not possibly on the same footing with her
>> stay longer. It almost seems belachenswert, however, that any
>> Science fortrückt other incessantly, in this, but the
>> Wisdom wants to be themselves, whose oracles befrägt every man,
>> flipped upside resistance at the same spot, without a step
>> ahead. Even their supporters have not lost much, and
>> things not seen, that those who feel strong enough to
>> shine the other sciences, to risk their reputation in this wish
>> where everyone else is ignorant of all the other things that are
>> presumes a crucial verdict, because in this country, in fact,
>> no safe level and weight is available to thoroughness
>> to be distinguished from shallow chatter.
>>
>> Second translation of the same text entered from an html file:
>>
>> If it is science, how is it that they do not, you can use like other
>> sciences, in general, and daurenden applause? If it is not, how is it that
>> they do under the guise of a science constantly doing great, and holds out
>> the human mind with never dying, but never fulfilled hopes? One may
>> therefore either demonstrate his knowledge or ignorance, yet he must again
>> about the nature of science usurped something certain to be identified,
>> because on the same footing, it can not possibly stay with her longer. It
>> almost seems belachenswert, however, that every other science fortrückt
>> incessantly, in this, but the wisdom that wants to be themselves, whose
>> oracles befrägt everyone, always on the same spot game instead, move forward
>> without a step. Even their supporters have not lost much, and no one sees
>> that those who want to feel strong enough to shine in other sciences, to
>> risk their glory in this, where everyone else is ignorant of all the other
>> things, a presumes decisive verdict, because there is in this country, in
>> fact, no safe level and weight in order to distinguish detail of shallow
>> chatter about.
>>
>>
>> _______________________________________________
>> Corpora mailing list
>> Corpora at uib.no
>> http://mailman.uib.no/listinfo/corpora
