new Google Books glitch
Garson O'Toole
adsgarsonotoole at GMAIL.COM
Wed Mar 28 13:12:10 UTC 2012
Thanks to Jon for initiating this thread and to Stephen, Fred, and
Joel for responding.
Below is an example of Google Advanced Book Search producing indirect
matches that do not contain the target string of a search. The website
of the Guardian (UK) newspaper has a webpage about Mark Twain dated
July 22, 2008:
http://www.guardian.co.uk/books/2008/jun/11/marktwain
The following quote appears at the top of the page, and it is often
attributed to Twain:
[Begin excerpt]
The man who doesn't read good books has no advantage over the man who
can't read them.
[End excerpt]
There is no compelling evidence that Twain ever offered this bit of
homespun wisdom. Ralph Keyes in "The Quote Verifier" notes that "no one
has ever confirmed that he wrote or said it."
The TwainQuotes website of Barbara Schmidt lists it as an "Unverified
quote" and provides an anonymous citation dated December 31, 1914.
http://www.twainquotes.com/Reading.html
If one performs an advanced search for the exact string "no advantage
over the man who can't read" with a publication date upper bound of
1914, several matches are presented:
[Begin summary of matches on first page]
Life on the Mississippi
Mark Twain - 1906
A tramp abroad
Mark Twain - 1907
The Adventures of Huckleberry Finn:
Mark Twain - 1912
The Prince and the Pauper: A Tale for Young People of all Ages
Mark Twain - 1909
Moby Dick; or, The white whale
Herman Melville - 1892
[End summary of matches on first page]
I do not believe that the string appears in any of the works above.
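When full text is available from some other source, a claimed match can
be checked by hand or with a short script. Below is a minimal sketch in
Python, assuming the text of each candidate book has already been
obtained and loaded as a string. The function names and the sample
texts are hypothetical placeholders of mine, and the script does not
call Google Books in any way.

```python
import re
from typing import Dict, List


def normalize(text: str) -> str:
    """Lowercase, straighten curly quotes, and collapse whitespace runs."""
    text = text.replace("\u2019", "'").replace("\u2018", "'")
    text = text.replace("\u201c", '"').replace("\u201d", '"')
    return re.sub(r"\s+", " ", text).strip().lower()


def contains_exact_phrase(full_text: str, phrase: str) -> bool:
    """Return True only if the phrase occurs verbatim after normalization."""
    return normalize(phrase) in normalize(full_text)


def confirmed_matches(candidates: Dict[str, str], phrase: str) -> List[str]:
    """Keep only the candidate titles whose text really contains the phrase."""
    return [
        title
        for title, text in candidates.items()
        if contains_exact_phrase(text, phrase)
    ]


if __name__ == "__main__":
    # Placeholder texts for illustration only; not real book contents.
    sample = {
        "Candidate A": "has no advantage over the man who\ncan\u2019t read them.",
        "Candidate B": "a passage that does not contain the target string",
    }
    print(confirmed_matches(sample, "no advantage over the man who can't read"))
```

The normalization step is there because line breaks and typographic
apostrophes in scanned text would otherwise hide a genuine match. It
does nothing about OCR errors, so a negative result is only suggestive.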
Garson
On Wed, Mar 28, 2012 at 6:40 AM, Shapiro, Fred <fred.shapiro at yale.edu> wrote:
>
> Actually, when I go into Advanced Book Search and run searches that I have run many times in the past, it looks like it is behaving much the same as it has.
>
> Fred Shapiro
>
>
>
> ________________________________________
> From: American Dialect Society [ADS-L at LISTSERV.UGA.EDU] on behalf of Shapiro, Fred [fred.shapiro at YALE.EDU]
> Sent: Wednesday, March 28, 2012 6:27 AM
> To: ADS-L at LISTSERV.UGA.EDU
> Subject: Re: new Google Books glitch
>
> This is all pretty distressing for those of us who try to use Google Books to find early uses of words, phrases, quotations, and proverbs. Is there any kind of "advanced search" or "classic Google Books search" that evades the new fuzziness?
>
> It is beginning to seem that, once Hathi Trust gets past its own search problems and develops a more robust search capability, Hathi Trust will supplant Google Books as a tool for historical-lexicographical research.
>
> Fred Shapiro
>
>
>
> ________________________________________
> From: American Dialect Society [ADS-L at LISTSERV.UGA.EDU] on behalf of Stephen Goranson [goranson at DUKE.EDU]
> Sent: Wednesday, March 28, 2012 5:09 AM
> To: ADS-L at LISTSERV.UGA.EDU
> Subject: Re: new Google Books glitch
>
> Yes, Google Book search has changed.
> First, apparently, it returns hits with synonyms of the search words.
> Second, apparently, it returns, e.g., phrases found in recent reviews of classic books, listing the book rather than the review as the source.
> These attempts to be helpful, to me, are not.
>
> Stephen Goranson
> http://www.duke.edu/~goranson
> _______________________________________
> From: American Dialect Society [ADS-L at LISTSERV.UGA.EDU] on behalf of Garson O'Toole [adsgarsonotoole at GMAIL.COM]
> Sent: Tuesday, March 27, 2012 10:04 PM
> To: ADS-L at LISTSERV.UGA.EDU
> Subject: Re: [ADS-L] new Google Books glitch
>
> The results generated by the Google search engine have, I think,
> changed noticeably during the past two months. The match algorithm has
> been updated so that it now performs some kind of indirect association
> mapping to generate more matches.
>
> When one performs a search looking for a precise string in quotes, the
> results now include many books that do not contain the quoted string.
> This behavior has been mentioned in the past on the ADS list, but now
> many more spurious matches are being displayed.
>
> For example, when I search for the string "inflamed with wild notions"
> the following items are listed as matches:
>
> The Republic of Plato: Volume 1
> Socrates: a translation of the Apology, Crito, and parts of the ...
> Clouds Aristophanes, Milton Wylie Humphreys
>
> The string does not appear in these books. But the string is part of a
> popular quote that began to appear in the 1960s (I think). It is often
> misattributed to Plato:
>
> [Begin excerpt]
> What is happening to our young people? They disrespect their elders,
> they disobey their parents. They ignore the law. They riot in the
> streets inflamed with wild notions. Their morals are decaying. What is
> to become of them?
> [End excerpt]
>
> So the Google match algorithm now apparently performs some type of
> indirect matching. The algorithm may look at the set of matches in the
> full Google database and create some kind of signature. It then
> matches items to the signature. That is a wild guess. Whatever
> technique is being used, it is clever. However, it makes my task more
> difficult.
>
> For example, when I search for quotes that are incorrectly credited to
> Mark Twain, Google now presents matches for several works of Twain.
> This occurs despite the fact that the target string is absent in
> Twain's oeuvre.
>
> Perhaps other list members have observed this behavior.
>
> Garson
>
> ------------------------------------------------------------
> The American Dialect Society - http://www.americandialect.org