[Corpora-List] problems with Google counts

Philip Resnik resnik at umiacs.umd.edu
Fri Mar 18 16:02:45 UTC 2005


Ring Low <mlow at acsu.buffalo.edu> writes:
>   I agree that using Google to conduct linguistic studies has gotten more
>   and more difficult since then, as the design of the search engine has
>   been changing for commercial reasons.  We do need a search engine
>   designed specifically for linguistic studies.

A few people wrote me to suggest that this might be a good opportunity
to mention the Linguist's Search Engine (http://lse.umiacs.umd.edu/).
And it is, assuming we carefully distinguish between linguistic
studies that do and do not rely on automatic counting.  A great deal
of linguistic insight can be gained by doing linguistically informed
searches, and then looking at the data with the same methodological
caveats that linguists must traditionally heed: you need to be sure
the data comes from a native speaker, that the word (or construction,
or sentence) is being used with the intended meaning, that the context
is not exercising some unusual influence, etc.

  Philip


  ----------------------------------------------------------------
  Philip Resnik, Associate Professor
  Department of Linguistics and Institute for Advanced Computer Studies

  1401 Marie Mount Hall            UMIACS phone: (301) 405-6760
  University of Maryland           Linguistics phone: (301) 405-8903
  College Park, MD 20742 USA       Fax: (301) 314-2644 / (301) 405-7104
  http://umiacs.umd.edu/~resnik    E-mail: resnik at umiacs.umd.edu
