A million English words, or only 600,000? Either way, it's a language packed with more words than you'll ever need

Laurence Horn laurence.horn at YALE.EDU
Wed Jul 9 15:33:11 UTC 2008

At 3:04 PM +0000 7/9/08, Tom Zurinskas wrote:
>One person said there are 2 billion English words.  Another said 1
>million.  That's a difference factor of 2,000.   That's like looking
>at a tree and one person estimating it's 1 inch tall, while the
>other estimates it's 2000 inches tall (170 feet).

No it's not.  Nobody (except you) was claiming that there are 2
billion *different words* in the English language.  There was
discussion of a 2 billion word database, but unless every word in the
database is distinct from every other word (i.e. no repetitions,
so every word has a frequency count of 1), the number of *types*
(which is what's under discussion in this thread) will be far smaller
than the number of *tokens*.  In "The woman discussed the letter with
the man" there are 8 word tokens but only 6 word types, since "the"
occurs three times.  Pick up a
newspaper and see how many paragraphs (if any) you can find that have
the same number of word types and word tokens.  This has been
explained several times on the list, so I'm not exactly sure why I'm
trying to do it again...
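
For anyone who wants to check the type/token arithmetic mechanically,
here is a minimal sketch in Python.  The function name
count_tokens_and_types is made up for illustration, and the
tokenization is a simplifying assumption: text is lowercased (so "The"
and "the" count as one type) and a word is any run of letters or
apostrophes.  A real corpus count would need more careful decisions
about inflected forms, hyphens, numbers and so on.

```python
import re
from collections import Counter


def count_tokens_and_types(text):
    """Return (number of tokens, number of types, per-type frequencies).

    Assumption for illustration only: a token is any run of letters or
    apostrophes, compared case-insensitively.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)  # maps each distinct type to its frequency
    return len(tokens), len(counts), counts


sentence = "The woman discussed the letter with the man"
n_tokens, n_types, counts = count_tokens_and_types(sentence)
print(n_tokens, n_types)   # 8 6
print(counts["the"])       # 3
```

Run on a 2 billion word database, the first number would be about 2
billion, while the second (the count relevant to this thread) would
be far smaller.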


The American Dialect Society - http://www.americandialect.org

More information about the Ads-l mailing list