Siring [was: Linguistic dark matter]

Joel S. Berson Berson at ATT.NET
Sat Dec 18 00:34:08 UTC 2010


Having been prompted by David Barnhart, I must
now correct my previous closing line:

How many (more) errors of scholarship will Google
be whelping with this new "tool"?

Joel

At 12/17/2010 09:41 AM, Joel S. Berson wrote:
>I forward a message from an 18th-century scholar on another list:
>>My immediate sense was that my painstaking
>>analysis of 200 novels available in three
>>different databases could have been done with
>>the click of a button. However I encountered
>>immediate problems with this new Google search
>>tool. On the first search I did, the first six
>>entries for 1795 included 1) the complete works
>>of Milton; 2) a collection of British poetry
>>(focus on Spenser and Shakespeare), and a
>>collection of historic British theatre; 3) two
>>dictionaries; and, exactly one book actually
>>published in 1795. The usual problems with the
>>initial scanning and indexing of documents.
>
>How many (more) errors of scholarship will Google
>be siring with this new "tool"?
>
>Joel
>
>At 12/17/2010 08:30 AM, Jonathan Lighter wrote:
>>Bad scans by Google must make up a fair number of those "dark" terms and
>>undermine the authority of the graphs.
>>
>>A search for "crud," for example, shows that virtually all examples
>>before the late thirties (allegedly) are bad scans of "cruel" and "crude."
>>And that's just in English.
>>
>>JL
>>
>>On Fri, Dec 17, 2010 at 7:21 AM, Paul Frank
>><paulfrank at post.harvard.edu> wrote:
>>
>> >
>> > On Fri, 17 Dec 2010 11:13 +0000, "Michael Quinion"
>> > <wordseditor at WORLDWIDEWORDS.ORG> wrote:
>> >
>> > >
>> > > Science reports on a massive searchable corpus created from some five
>> > > million books, now available on Google: http://ngrams.googlelabs.com/
>> > >
>> > > One report is here: http://bit.ly/ffQCmR . It quotes the researchers:
>> > >
>> > > "We estimated that 52% of the English lexicon - the majority of words
>> > > used in English books - consist of lexical 'dark matter' undocumented
>> > > in standard references."
>> >
>> > What's a standard reference? I bet that more than 90% of the technical
>> > terms used in agrochemistry, analytical chemistry and
>> > astrochemistry; acoustics, agrophysics and atomic physics; astrobiology,
>> > astrochemistry, astrodynamics, astrometry, astrophysics; atmospheric
>> > sciences; anatomy and astrobiology; automata theory, artificial
>> > intelligence, algebraic computation; algebra, analysis, applied
>> > mathematics, and so on down to the letter z, are not in the OED or in
>> > any other single dictionary. And if you take all the technical terms in
>> > the social sciences, the arts, and other branches of learning, I bet
>> > it's closer to 99%. But that's okay. Tiki mug collectors don't need
>> > English dictionaries to tell them what a tiki mug is. And the rest of us
>> > can look it up in the Wikipedia
>> > (http://en.wikipedia.org/wiki/Tiki_mugs), which is inching ever
>> > closer to Borges' Library of Babel or the planet Memory Alpha, but will
>> > never actually get there.
>> >
>> > Paul
>> >
>> > --
>> >
>> > Paul Frank
>> > Translator
>> > Chinese, German, French, Italian > English
>> > Espace de l'Europe 16
>> > Neuchâtel, Switzerland
>> > paulfrank at bfs.admin.ch
>> > paulfrank at post.harvard.edu
>> >
>> > ------------------------------------------------------------
>> > The American Dialect Society - http://www.americandialect.org
>> >
>>
>>
>>
>>--
>>"If the truth is half as bad as I think it is, you can't handle the truth."