Linguistic dark matter

Arnold Zwicky zwicky at STANFORD.EDU
Fri Dec 17 16:14:34 UTC 2010


On Dec 17, 2010, at 3:13 AM, Michael Quinion wrote:

> Science reports on a massive searchable corpus created from some five
> million books, now available on Google: http://ngrams.googlelabs.com/
>
> One report is here: http://bit.ly/ffQCmR . It quotes the researchers:
>
> "We estimated that 52% of the English lexicon - the majority of words used
> in English books - consist of lexical 'dark matter' undocumented in
> standard references."

GN, 12/16/10: Humanities research with the Google books corpus:
 http://languagelog.ldc.upenn.edu/nll/?p=2847

which doesn't, however, take up the issue of "linguistic dark matter", though more postings are to come.

arnold

------------------------------------------------------------
The American Dialect Society - http://www.americandialect.org


