Linguistic dark matter
Ben Zimmer
bgzimmer at BABEL.LING.UPENN.EDU
Fri Dec 17 15:14:52 UTC 2010
On Fri, Dec 17, 2010 at 9:14 AM, Michael Quinion
<wordseditor at worldwidewords.org> wrote:
>
> David Barnhart wrote
>
> > If you haven't noticed I'm skeptical of the "tool".
>
> I'm certainly sceptical of that 52% "undocumented in standard references",
> which was why I quoted that sentence. The figure seems extremely high. As
> I can't get access to the Science article (which is only free online to
> subscribers), I can't begin to work out its basis.
>
> The researchers seem not to have applied many lexical filters. Proper
> names are included, because they want the corpus to be a cultural tool as
> well as a lexicographical one. Similarly, they allow scientific names
> ("Turdus merula" and the like). I would have thought that - if the
> "standard references" are restricted to general dictionaries - proper and
> scientific names would account for a big part of that missing 52%.
To be fair, proper nouns were included in the researchers' overall
lexical count, but the "dark matter" is not 52% of that number. They
did filter out proper nouns for that part of the analysis, since they
were going for an apples-to-apples comparison with the OED and
Webster's Third. The media coverage doesn't get into these subtleties,
of course.
--bgz
--
Ben Zimmer
http://benzimmer.com/
------------------------------------------------------------
The American Dialect Society - http://www.americandialect.org