<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

<html>

<head>

  <meta content="text/html; charset=windows-1252"

 http-equiv="Content-Type">

</head>

<body bgcolor="#ffffff" text="#000000">

On 3/11/2010 10:10 AM, chris brew wrote:

<blockquote

 cite="mid:8b53244a1003110710w416a70d1tbb8851d24bb6b19a@mail.gmail.com"

 type="cite"><br>

  <br>

  <div class="gmail_quote">On Thu, Mar 11, 2010 at 8:18 AM, Peter Kolb <span

 dir="ltr">&lt;<a moz-do-not-send="true" href="mailto:pekoli@gmail.com">pekoli@gmail.com</a>&gt;</span>

wrote:<br>

  <blockquote class="gmail_quote"

 style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">I

have three comments:<br>

    <br>

1. The text by Kant contains a lot of anaphoric pronouns. From Google's

translation it is obvious that their system does not perform any

pronoun resolution (or at least none that works better than a random

baseline). However, there exist German to English translation engines

on the market that incorporate such components.<br>

  </blockquote>

  <div><br>

  </div>

  <div>I would moderate that conclusion. If, as I suspect, the Google
engine for German to English is a statistical one, it will be choosing
a translation by optimizing a complex internal objective that trades
off multiple criteria. Because SMT systems are not conventionally
modular, it is hard to say what components they have or do not have.<br>
  </div>

  </div>

</blockquote>

<br>

    Which is why it would be a huge boon to the science of language if

more of the statistical machine translation systems produced some kind

of human-readable report of what they "learn" from their "training"

data.<br>

<pre class="moz-signature" cols="72">-- 

                                -Angus B. Grieve-Smith

                                <a class="moz-txt-link-abbreviated" href="mailto:grvsmth@panix.com">grvsmth@panix.com</a>

</pre>

</body>

</html>