[Corpora-List] Query about nomenclature

Andrew Kehoe Andrew.Kehoe at uce.ac.uk
Fri Mar 11 15:33:01 UTC 2005


John

You need to use the search term "ngram -perl" rather than "ngram not
perl" because, as Stefan Evert pointed out, Google treats "not" as an
ordinary search word, so "ngram not perl" just returns pages containing
all three of those words.
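
To illustrate the difference, here is a minimal Python sketch of the
matching behaviour described above. It is only a toy model, not Google's
actual algorithm: the function name matches() and the three sample pages
are made up for the example, and it assumes that every plain term must
occur on the page while a term prefixed with "-" must not.

```python
def matches(query: str, page_text: str) -> bool:
    """Toy model: every plain term must occur; '-term' must not occur."""
    words = set(page_text.lower().split())
    for term in query.lower().split():
        if term.startswith("-") and len(term) > 1:
            # Exclusion operator: reject the page if the word is present.
            if term[1:] in words:
                return False
        elif term not in words:
            # Plain term (including the word "not"): it must be present.
            return False
    return True


# Hypothetical pages, invented for illustration only.
pages = [
    "ngram module for perl",
    "ngram models in corpus linguistics",
    "this ngram script is not perl",
]

# "not" is just another required word here.
print([p for p in pages if matches("ngram not perl", p)])

# "-perl" excludes every page that mentions perl.
print([p for p in pages if matches("ngram -perl", p)])
```

Under these assumptions the first query keeps only the page that happens
to contain the word "not", while the second keeps only the page that does
not mention perl.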

Another problem with your method is that Google ignores hyphens in
search terms. One of the pages returned for the term "n-gram" is
http://cpan.dei.uc.pt/authors/id/J/JH/JHI/ngram.pl-1.48&e=8092, but this
page does not contain the word "n-gram" at all, only "ngram" without the
hyphen. The hit count for "n-gram" therefore includes at least some
pages that use the unhyphenated spelling.
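
If you have the page text locally, the two spellings can be counted
separately instead of relying on hit counts. The following Python sketch
assumes a simple hyphen-dropping normalisation to mimic the effect
described above (the real behaviour of the search engine may differ);
hyphen_blind(), count_spellings() and the sample sentence are all
invented for the example.

```python
import re
from collections import Counter


def hyphen_blind(token: str) -> str:
    """Assumed normalisation: lower-case and drop hyphens."""
    return token.lower().replace("-", "")


def count_spellings(text: str) -> Counter:
    """Count exact spellings of ngram / n-gram, keeping the hyphen distinct."""
    pattern = re.compile(r"\bn-?grams?\b", re.IGNORECASE)
    return Counter(m.group(0).lower() for m in pattern.finditer(text))


# Hypothetical sample text, invented for illustration only.
sample = (
    "The ngram.pl script builds N-gram tables; "
    "each n-gram is counted, and ngrams are sorted."
)

# With hyphens dropped, the two spellings become indistinguishable.
print(hyphen_blind("n-gram") == hyphen_blind("ngram"))

# A hyphen-aware count keeps them apart.
print(count_spellings(sample))
```

A count of this kind distinguishes the hyphenated and unhyphenated
forms, which a hyphen-insensitive search cannot do.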

Andrew Kehoe
Research and Development Unit for English Studies
School of English
University of Central England, Birmingham
http://rdues.uce.ac.uk/
 
http://www.webcorp.org.uk/ 

-----Original Message-----
From: owner-corpora at lists.uib.no [mailto:owner-corpora at lists.uib.no] On
Behalf Of John F. Sowa
Sent: 10 March 2005 01:43
To: Damon Allen Davison
Cc: John Mckenny; CORPORA at HD.UIB.NO
Subject: Re: [Corpora-List] Query about nomenclature

Damon Davison's use of Google inspired me to try
a variation.  I just typed three queries and
got the following number of hits:

Search string            Hits
-------------           ------
ngram                   21,100
ngram not perl             540
n-gram                  85,700

This seems to provide overwhelming evidence for
a hyphen between "n" and "gram".  Since Google
doesn't distinguish capitals, that leaves the
capitalization question unresolved.

John Sowa


