new Google Books glitch

Garson O'Toole adsgarsonotoole at GMAIL.COM
Wed Mar 28 02:04:03 UTC 2012


The results generated by the Google search engine have, I think,
changed noticeably during the past two months. The match algorithm has
been updated so that it now performs some kind of indirect association
mapping to generate more matches.

When one searches for a precise string in quotes, the
results now include many books that do not contain the quoted string.
This behavior has been mentioned in the past on the ADS list, but now
many more spurious matches are being displayed.

For example, when I search for the string "inflamed with wild notions"
the following items are listed as matches:

The Republic of Plato: Volume 1
Socrates: a translation of the Apology, Crito, and parts of the ...
Clouds Aristophanes, Milton Wylie Humphreys

The string does not appear in these books. But the string is part of a
popular quote that began to appear in the 1960s (I think). It is often
misattributed to Plato:

[Begin excerpt]
What is happening to our young people? They disrespect their elders,
they disobey their parents. They ignore the law. They riot in the
streets inflamed with wild notions. Their morals are decaying. What is
to become of them?
[End excerpt]

So the Google match algorithm now apparently performs some type of
indirect matching. The algorithm may look at the set of matches in the
full Google database and create some kind of signature. It then
matches items to the signature. That is a wild guess. Whatever
technique is being used, it is clever. However, it makes my task more
difficult.

For example, when I search for quotes that are incorrectly credited to
Mark Twain, Google now presents several of Twain's works as matches.
This occurs even though the target string is absent from Twain's
oeuvre.
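
One way to screen out such spurious matches, assuming the full text of
each candidate book is available locally, is to check for the literal
phrase after normalizing case and whitespace. Below is a minimal Python
sketch. The function names and the title-to-text mapping are
hypothetical illustrations, not anything Google provides, and the check
says nothing about how Google's own matching works.

```python
import re

def normalize(text):
    """Lowercase and collapse whitespace so line breaks do not hide a match."""
    return re.sub(r"\s+", " ", text).strip().lower()

def contains_exact_phrase(phrase, text):
    """Return True only if the literal phrase occurs in the text."""
    return normalize(phrase) in normalize(text)

def filter_true_matches(phrase, candidates):
    """Keep the titles whose text really contains the phrase.

    candidates is a hypothetical mapping from a title to its full text.
    """
    return [title for title, text in candidates.items()
            if contains_exact_phrase(phrase, text)]
```

Any title that the search engine lists but this filter drops would be
a candidate for the indirect matching described above.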

Perhaps other list members have observed this behavior.

Garson

------------------------------------------------------------
The American Dialect Society - http://www.americandialect.org


