Linguistic dark matter

Fri Dec 17 11:13:56 UTC 2010

Science reports on a massive searchable corpus created from some five
million books, now available on Google.

One report quotes the researchers:

"We estimated that 52% of the English lexicon - the majority of words used
in English books - consists of lexical 'dark matter' undocumented in
standard references."

