Linguistic dark matter

Jonathan Lighter wuxxmupp2000 at GMAIL.COM
Fri Dec 17 13:30:52 UTC 2010


Bad scans by Google must make up a fair number of those "dark" terms and
undermine the authority of the graphs.

A search for "crud," for example, shows that virtually all examples
allegedly from before the late thirties are bad scans of "cruel" and "crude."
And that's just in English.

JL

On Fri, Dec 17, 2010 at 7:21 AM, Paul Frank <paulfrank at post.harvard.edu> wrote:

> ---------------------- Information from the mail header -----------------------
> Sender:       American Dialect Society <ADS-L at LISTSERV.UGA.EDU>
> Poster:       Paul Frank <paulfrank at POST.HARVARD.EDU>
> Subject:      Re: Linguistic dark matter
>
> -------------------------------------------------------------------------------
>
> On Fri, 17 Dec 2010 11:13 +0000, "Michael Quinion"
> <wordseditor at WORLDWIDEWORDS.ORG> wrote:
>
> ------------------------------------------------------------------------------
> >
> > Science reports on a massive searchable corpus created from some five
> > million books, now available on Google: http://ngrams.googlelabs.com/
> >
> > One report is here: http://bit.ly/ffQCmR . It quotes the researchers:
> >
> > "We estimated that 52% of the English lexicon - the majority of words
> > used in English books - consist of lexical 'dark matter' undocumented
> > in standard references."
>
> What's a standard reference? I bet that more than 90% of the technical
> terms used in agrochemistry, analytical chemistry and
> astrochemistry; acoustics, agrophysics and atomic physics; astrobiology,
> astrochemistry, astrodynamics, astrometry, astrophysics; atmospheric
> sciences; anatomy and astrobiology; automata theory, artificial
> intelligence, algebraic computation; algebra, analysis, applied
> mathematics, and so on down to the letter z, are not in the OED or in
> any other single dictionary. And if you take all the technical terms in
> the social sciences, the arts, and other branches of learning, I bet
> it's closer to 99%. But that's okay. Tiki mug collectors don't need
> English dictionaries to tell them what a tiki mug is. And the rest of us
> can look it up in the Wikipedia
> (http://en.wikipedia.org/wiki/Tiki_mugs), which is inching ever
> closer to Borges' Library of Babel or the planet Memory Alpha, but will
> never actually get there.
>
> Paul
>
> --
>
> Paul Frank
> Translator
> Chinese, German, French, Italian > English
> Espace de l'Europe 16
> Neuchâtel, Switzerland
> paulfrank at bfs.admin.ch
> paulfrank at post.harvard.edu
>
> ------------------------------------------------------------
> The American Dialect Society - http://www.americandialect.org
>



-- 
"If the truth is half as bad as I think it is, you can't handle the truth."
