is Google reliable?

Laurence Horn laurence.horn at YALE.EDU
Thu Dec 12 20:03:15 UTC 2002

>abatefr at EARTHLINK.NET,Net writes:
>>Google is a very lame tool for lexical research compared to, say, a
>>dynamic corpus of contemporary English.
>While I have not a very high opinion of a "straight" google search, it
>is made more useful for general linguistic research in news media by
>using "Google News."  This resource is described in their own words:
>Google News presents information culled from approximately 4,000 news
>sources worldwide and automatically arranged to present the most
>relevant news first. Topics are updated continuously throughout the
>day, so you will see new stories each time you check the page.

In defense of the recently maligned (straight, non-news) Google...I'm
sure Frank, David et al. are right for some searches.  But if you're
trying to track down innovative uses of language by ordinary people
and/or by those writing in a wide variety of (non-news) places, straight
Google actually provides better feedback than the Google News
site.   In my work on un-nouns (most recently checking out various
antedates for "unmarriage", prompted by a New Yorker talk of the town
piece in the 12/9/02 issue) and on "spitten image", I've used google
to confirm, for example, that (a) "spitten" (both as a participle in
general and in "spitten image" in particular) is alive and well in
various non-standard varieties of English and that (b) "unmarriage",
with meanings ranging from "the state of being in a committed
relationship that isn't a marriage" to a near-synonym for divorce or
separation, has been around since at least the early 1970's.  The
latter information is not available in the OED, AHD4, or through
Google News, which in fact brings up one hit for "unmarriage",
that same recent New Yorker piece that prompted my search in the
first place.  So if you find google lame or unworthy for lexical
research, fine; I'll keep it in my bookmarks.


P.S.  It's also quite handy for checking out reanalyses in progress,
in a way that more carefully edited resources (the ones indexed by
Google News or Nexis) aren't.
