[Corpora-List] Man bites dog
John D Burger
john at mitre.org
Mon Nov 21 15:46:23 UTC 2011
I'm not entirely sure I understand Kay's point, but I gather we are to assume that the language model scores "dog bites man" as more likely than "man bites dog". Arguing about whether that assumption actually holds misses the point, I believe.
Even if the assumption is true, LMs are but one source of evidence that SMT systems use. Moses, for instance, will have specific statistics about how often the phrase to the left of "mord" moves to the right of its translation, and so on for the other words and phrases. Such reordering probably happens rarely, so the system will weigh that evidence against the LM's preference. The same constraint can be modeled more appealingly if Moses is trained with a syntactic model. In that case, I think the odds would be very low that the "subject" and "object" of "mord" would switch places during translation.
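To make the weighing concrete, here is a minimal sketch of a log-linear combination of feature scores, which is the general way phrase-based systems such as Moses combine their models. Everything specific in it is an assumption for illustration: the feature names, the probabilities, and the weights are invented, not taken from any trained model, and the monotone (unreordered) output is assumed to be "man bites dog".

```python
import math

# Hypothetical log-probabilities for two candidate outputs.
# The LM is assumed to prefer "dog bites man" (as in the thread's premise),
# while the reordering model is assumed to penalize swapping the phrases
# on either side of the verb. None of these numbers come from a real model.
candidates = {
    "man bites dog": {
        "lm": math.log(0.001),
        "reordering": math.log(0.9),    # monotone order: cheap
        "translation": math.log(0.5),
    },
    "dog bites man": {
        "lm": math.log(0.004),          # LM likes this one 4x better
        "reordering": math.log(0.01),   # swap around the verb: rare
        "translation": math.log(0.5),
    },
}

# Hypothetical feature weights; in a real system these are tuned on
# held-out data rather than set by hand.
weights = {"lm": 1.0, "reordering": 1.0, "translation": 1.0}


def score(features, weights):
    """Weighted sum of log feature values (a log-linear model score)."""
    return sum(weights[name] * value for name, value in features.items())


scores = {text: score(feats, weights) for text, feats in candidates.items()}
best = max(scores, key=scores.get)
print(best)
```

With these made-up numbers the reordering penalty outweighs the LM's preference, so the unreordered candidate wins. Raising the LM weight far enough would flip the outcome, which is why the relative weights matter.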
- John Burger
MITRE
On Nov 21, 2011, at 09:30, maxwell wrote:
> On Mon, 21 Nov 2011 12:01:59 +0000, "Jimmy O'Regan" <joregan at gmail.com>
> wrote:
>> ...More importantly, it assumes that
>> 'dog bites man' is a more frequent trigram in English (i.e., the
>> target language model), which doesn't seem to be true
>> ...
>> which makes sense in hindsight, when you consider the idiomatic value
>> of 'man bites dog'.
>
> Yes, 'man bites dog' has been around the block a few times, and 'dog bites
> man' is probably not that common because it isn't newsworthy--which is
> precisely the point of the old saying, "If a dog bites a man, that's not
> news. If a man bites a dog, that's news."
>
> So use your imagination: "Baby kisses politician" (6 hits, vs. 18,000 for
> "Politician kisses baby"). I was going to say, "Man bites snake", but that
> seems oddly common, even in comparison to "Snake bites man." My favorite
> is "Man bites snake, faces animal cruelty charges." Which says something
> odd about our world, but never mind.
>
> Your other points (about bigrams, etc.) are of course relevant.
>
> Mike Maxwell
>
> _______________________________________________
> UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora