more bistromathics at Google

Benjamin Zimmer bgzimmer at BABEL.LING.UPENN.EDU
Wed Mar 12 16:54:46 UTC 2008


On Wed, Mar 12, 2008 at 10:48 AM, Joel S. Berson <Berson at att.net> wrote:
>
> At 3/12/2008 10:35 AM, Mark Mandel wrote:
> >Search for pages containing "treebank"; then "treebank" but not "forestry";
> >then "treebank" but not "forestry" or "forest". Successive removal of
> >subsets should make smaller sets, right?
> >
> >Personalized Results 1 - 100 of about 138,000 for treebank. (0.44 seconds)
> >Personalized Results 1 - 100 of about 117,000 for treebank -forestry.
> >(0.39 seconds)
> >Personalized Results 1 - 100 of about 212,000 for treebank -forestry
> >-forest. (0.37 seconds)
> >
> >Wrong??
> >
> >Bistromathics <http://hhgproject.org/entries/bistromathics.html>.
>
> I've often been puzzled by my encounters with this phenomenon.  Now I
> understand -- Google's search engine is implemented by googols of
> maitre d's at typewriters, on short shifts.  (That accounts for the
> contents as well as the numbers: immense quantities of misspellings
> and incorrect grammar along with the Shakespeare.)

Disjunctive queries highlight the overall inaccuracy of Google's
search result numbers. Here's me complaining about it in Nov. '05:

http://itre.cis.upenn.edu/~myl/languagelog/archives/002658.html

See also the advice in that post about obtaining more accurate numbers
from Google's "most relevant" results rather than overall results.
Unfortunately this trick only works on queries yielding results under
Google's 1,000-page cutoff.
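
The inconsistency in the counts quoted above can be checked mechanically:
each added exclusion term should leave the estimate the same or smaller.
A minimal sketch in Python, assuming only the three figures reported in
Mark Mandel's message (the function name is hypothetical, and nothing
here queries Google):

```python
# Estimated hit counts exactly as quoted in the message above.
reported = [
    ("treebank", 138_000),
    ("treebank -forestry", 117_000),
    ("treebank -forestry -forest", 212_000),
]


def monotonicity_violations(counts):
    """Return adjacent query pairs where adding an exclusion raised the count.

    Each query in `counts` is assumed to be the previous one plus one more
    excluded term, so a consistent index would never report a larger number.
    """
    violations = []
    for (q_prev, n_prev), (q_next, n_next) in zip(counts, counts[1:]):
        if n_next > n_prev:
            violations.append((q_prev, q_next, n_next - n_prev))
    return violations


for q_prev, q_next, excess in monotonicity_violations(reported):
    print(f"{q_next!r} exceeds {q_prev!r} by {excess:,}")
```

Only the last step is flagged: dropping "forest" as well as "forestry"
raises the estimate instead of lowering it.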

--Ben Zimmer

------------------------------------------------------------
The American Dialect Society - http://www.americandialect.org


