Linguistic dark matter

Paul Frank paulfrank at POST.HARVARD.EDU
Fri Dec 17 12:21:59 UTC 2010

On Fri, 17 Dec 2010 11:13 +0000, "Michael Quinion"
<wordseditor at WORLDWIDEWORDS.ORG> wrote:
> Science reports on a massive searchable corpus created from some five
> million books, now available on Google:
> One report is here: . It quotes the researchers:
> "We estimated that 52% of the English lexicon - the majority of words
> used in English books - consist of lexical 'dark matter' undocumented
> in standard references."

What's a standard reference? I bet that more than 90% of the technical
terms used in agrochemistry, analytical chemistry, and
astrochemistry; acoustics, agrophysics and atomic physics; astrobiology,
astrochemistry, astrodynamics, astrometry, astrophysics; atmospheric
sciences; anatomy and astrobiology; automata theory, artificial
intelligence, algebraic computation; algebra, analysis, applied
mathematics, and so on down to the letter z, are not in the OED or in
any other single dictionary. And if you take all the technical terms in
the social sciences, the arts, and other branches of learning, I bet
it's closer to 99%. But that's okay. Tiki mug collectors don't need
English dictionaries to tell them what a tiki mug is. And the rest of us
can look it up in Wikipedia, which is inching ever
closer to Borges' Library of Babel or the planet Memory Alpha, but will
never actually get there.



Paul Frank
Chinese, German, French, Italian > English
Espace de l'Europe 16
Neuchâtel, Switzerland
paulfrank at

The American Dialect Society -
