A million English words, or only 600,000? Either way, it's a language packed with more words than you'll ever need
Dave Wilton
dave at WILTON.NET
Wed Jul 9 16:18:40 UTC 2008
More importantly, how do you define "word"? There are over one million named
species of animals (mostly insects, of which there are over 920,000 named
species). Are these names "words"? Most will never appear in a dictionary,
even specialty biological dictionaries.
And are they English? They're generally Latinate coinages, but they're not
truly Latin.
An estimate of a billion English words is off by a couple of orders of
magnitude. But you could come up with reasonable estimates for English words
in the millions, perhaps as many as ten million if you stretch the
definition of "reasonable" to the breaking point.
It all depends on how and what you count.
And to what point? The exercise, while an interesting curiosity, is not very
useful.
-----Original Message-----
From: American Dialect Society [mailto:ADS-L at LISTSERV.UGA.EDU] On Behalf Of
Laurence Horn
Sent: Wednesday, July 09, 2008 8:33 AM
To: ADS-L at LISTSERV.UGA.EDU
Subject: Re: A million English words, or only 600,000? Either way, it's a
language packed with more words than you'll ever need
At 3:04 PM +0000 7/9/08, Tom Zurinskas wrote:
>One person said there are 2 billion English words. Another said 1
>million. That's a difference factor of 2,000. That's like looking
>at a tree and one person estimating it's 1 inch tall, while the
>other estimates it's 2000 inches tall (170 feet).
No it's not. Nobody (except you) was claiming that there are 2
billion *different words* in the English language. There was
discussion of a 2 billion word database, but unless every word in the
database is distinct from every other word (i.e. no repetitions,
so every word has a frequency count of 1), the number of *types*
(which is what's under discussion in this thread) will be far smaller
than the number of *tokens*. In "The woman discussed the letter with
the man" there are 8 word tokens but 6 word types. Pick up a
newspaper and see how many paragraphs (if any) you can find that have
the same number of word types and word tokens. This has been
explained several times on the list, so I'm not exactly sure why I'm
trying to do it again...
LH
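
For anyone who wants to check the type/token count mechanically, here is a minimal sketch in Python. The function name, the regular expression, and the decision to fold case (so that "The" and "the" count as one type) are illustrative assumptions, not anything proposed in the thread; a different definition of "word" would give different numbers.

```python
import re
from collections import Counter

def count_tokens_and_types(text):
    """Return (number of word tokens, number of word types) in text."""
    # Assumption: a "word" is a run of letters or apostrophes, and case is
    # ignored, so "The" and "the" are the same type.
    tokens = re.findall(r"[a-z']+", text.lower())
    # Each distinct token is one type; Counter also keeps the frequencies.
    counts = Counter(tokens)
    return len(tokens), len(counts)

sentence = "The woman discussed the letter with the man"
n_tokens, n_types = count_tokens_and_types(sentence)
print(n_tokens, n_types)  # 8 tokens, 6 types
```

Run on a large corpus, the same count shows why a 2 billion word database says little by itself about how many distinct words the language has: the token total keeps growing with every repetition, while the type total grows only when a new word turns up.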
------------------------------------------------------------
The American Dialect Society - http://www.americandialect.org