Ngrams and proper nouns

Victor Steinbok aardvark66 at GMAIL.COM
Wed Dec 22 20:03:21 UTC 2010

The paper did claim that for their data, but, I believe, it was done
manually. Also, n-grams are "case sensitive", although I found errors in
that too. Besides, if they had relied on case, they would have eliminated a
significant number of "false" positives for capitalization (i.e., capped
but not proper).
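
To make the case-sensitivity point concrete, here is a minimal Python sketch. It is not how Ngrams or the paper processed anything; the sample sentence, the tokenizer, and the function names are all hypothetical, made up for illustration. It only shows that a case-sensitive count keeps "Brown" and "brown" apart, and that a sentence-initial "Brown" lands in the capitalized bucket even though it is not a proper noun.

```python
from collections import Counter


def count_unigrams(text):
    """Count whitespace-separated tokens, keeping case distinct."""
    # Strip surrounding punctuation but leave capitalization untouched.
    tokens = [t.strip('.,;:!?"()') for t in text.split()]
    return Counter(t for t in tokens if t)


def fold_case(counts):
    """Merge counts of tokens that differ only in capitalization."""
    folded = Counter()
    for token, n in counts.items():
        folded[token.lower()] += n
    return folded


# Hypothetical sample text (an assumption, not Ngrams data).
# The first "Brown" is capitalized only because it starts a sentence;
# the second is a surname.
sample = "Brown bears live here. Mr. Brown saw a brown bear."

counts = count_unigrams(sample)
folded = fold_case(counts)

print(counts["Brown"], counts["brown"], folded["brown"])
```

In this sample the case-sensitive count puts two tokens under "Brown" and one under "brown", but only one of the two capitalized tokens is actually a proper noun. Capitalization alone cannot separate them.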


On 12/22/2010 11:33 AM, Joel S. Berson wrote:
> I believe I have read here that Ngrams (or rather, I suppose, its
> database) has eliminated (reduced?) the occurrence of proper nouns.
> How do they distinguish proper nouns from common nouns that are
> homographs?  And particularly, for the periods where all nouns were
> generally capitalized?
> Joel

The American Dialect Society -
