[Corpora-List] Interlingual Machine Translation Systems (fwd)
Yorick Wilks
yorick at dcs.shef.ac.uk
Sun Nov 21 14:50:44 UTC 2004
I evaluated SYSTRAN for the US Air Force back in 1980 or so, and at that
time it was producing acceptably correct English translations for about
70-80% of Russian sentences, and for about 60% of sentences in the other
languages it handled. SYSTRAN has now been working for the European
Commission in Luxembourg, translating between English, French and German,
for about 25 years and handles millions of words a week; they could not
possibly function without it. I am afraid your 1% guess bears no relation
to the facts at all. Look at the Commission's website.
YW
On Sunday, November 21, 2004, at 09:53 AM, Sergey Protasov wrote:
> Ok, Yorick,
>
> We should define what "good" means for MT systems.
>
> If we take arbitrary sentences from some very big, non-specialized
> English corpus and have an expert human translator translate them, we
> get about 80-90% correctly translated sentences.
> Let's define this as the best quality of translation.
>
> So a "good" translation is about 45-50% correct sentences.
>
> And "satisfactory" is about 20% correct sentences (logarithmic scale).
>
> I think Systran and any other MT system can correctly translate no
> more than one percent of sentences arbitrarily selected from a big
> corpus.
>
> This is not "good" in any case, IMHO.
>
> If I am wrong about the 1%, please let me know.
>
> I'm sorry for my bad English.
>
>
> Yorick Wilks wrote:
>> This reply from Russia is total nonsense, unless "good" means
>> something utterly impractical. There are many evaluated MT systems
>> that do a reasonable job (i.e. giving a good indication of what a
>> document says) and some are available free on search sites, as we
>> all know. The world's oldest and strongest system, SYSTRAN, sometimes
>> does a very good job. Recommending a 20-word MT system shows utter
>> ignorance of the last forty years.
>> Yorick Wilks
>> On Friday, November 19, 2004, at 12:18 PM, Sergey Protasov wrote:
>>>
>>> Eric,
>>>
>>> There are no good MT systems today at all.
>>> So there are no good open-source MT systems today.
>>>
>>> ThoughtTreasure is a very big system for teaching, and it has a bad
>>> syntax parser. (It fails if a sentence has more than 7-10 words.)
>>>
>>> For teaching, I recommend you look at the link grammar translator:
>>> http://www.link.cs.cmu.edu/link/submit-to-translator.html
>>> It shows very good translations, but it has only 20 words in its
>>> vocabulary.
>>>
>>> You can add more words, but that is not trivial.
>>>
>>> If you are interested in statistical machine translation, forget what
>>> I said before and go here:
>>>
>>> http://www.isi.edu/licensed-sw/rewrite-decoder/
>>>
>>>
>>> It is simple, but you will not know how it works.
>>>
>>>
>>>
>>> --
>>> Sergey Protasov
>>> PhD student in Computational Linguistics,
>>> Moscow Institute of Physics and Technology
>>>
>>>
>>>
>>>
>>> Eric Atwell wrote:
>>>
>>>> Sergey,
>>>> do you have any evaluation report or other evidence of how good
>>>> this OpenSource MT system is? Bogdan Babych, researcher here at
>>>> Leeds, is thinking of developing a demo MT system for research and
>>>> teaching, but it may be worth considering adapting an existing
>>>> open-source system.
>>>> regards
>>>> Eric Atwell
>>>
>>>
>>>
>>>
>
>