Fwd: Finding the long s

Joel S. Berson Berson at ATT.NET
Tue Oct 11 14:48:18 UTC 2011


Ingenious use of Google Ngrams to trace the decline in use of the long s.

Joel

>From: Brian Ogilvie <bwogilvie at GMAIL.COM>
>
>The disappearance of the long S in English is one of those phenomena
>that can be traced using Google Ngrams, though only by accident:
>Google's OCR fairly reliably misreads the 18th-century long S as an F.
>I discovered this while fooling around with the site and being stunned
>at how rare words like "papist" were before 1800; then it hit me that
>I ought to be searching for "papift." Compare these graphs:
>
><http://books.google.com/ngrams/graph?content=papist%2Cpapift&year_start=1600&year_end=1900&corpus=0&smoothing=3>
><http://books.google.com/ngrams/graph?content=trust%2Ctruft&year_start=1600&year_end=1900&corpus=0&smoothing=3>
><http://books.google.com/ngrams/graph?content=set%2Cfet&year_start=1600&year_end=1900&corpus=0&smoothing=3>
>
>(I enclosed the URLs in angle brackets, which ought to make them
>clickable even if they cross a line break in your email program, but
>if not, copy and paste them into a browser.)
>
>If you choose any word with a non-terminal S that is not an English
>word when the S is replaced with an F, you will almost certainly see a
>similar graph, with the F-version (i.e. the long S) starting to
>decline c. 1790, crossing the short S right around 1800, and almost
>totally gone by 1815. Moreover, the upward slope on the S version is
>nearly a mirror image of the downward slope on the F version. The best
>words for testing the long S are common words, especially function
>words--though not always. Looking for she/fhe, for instance, shows a
>distinct upward trend, first of fhe and then of she. Again, though,
>1800 is the point where the long S is overtaken by its rival:
><http://books.google.com/ngrams/graph?content=she%2Cfhe&year_start=1600&year_end=1900&corpus=0&smoothing=3>
>
>Caveats: my completely unscientific examination of some of the
>17th-century texts suggests that Google is better at identifying the
>long S in them, possibly because the half-crossbar tends to be longer.
>And of course majuscules, such as in titles, are not affected, since
>the long S has no capital form.
>Still, the evidence seems quite strong, and it's consistent across a
>range of words.
>
>If we want to know _why_ this happened, research such as Paul Nash's
>article, which Jerry Morris summarized, is crucial.
>
>Brian
>
>P.S. The implications for anyone who does searches in early modern
>full text databases are obvious....
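
For anyone who wants to try other words, the recipe above (swap each
non-terminal lowercase s for an f, then graph the pair) can be sketched
in Python. This is only a rough illustration: the function names are
made up for this note, the substitution rule is the simple one Brian
describes rather than the actual typographic rules for the long s, and
the query parameters are copied from his links, so they are assumed
rather than checked against the current Ngram Viewer.

```python
from urllib.parse import urlencode

# Base address taken from the links in the message above.
NGRAM_BASE = "http://books.google.com/ngrams/graph"


def long_s_as_f(word):
    """Return the OCR-style misreading: every lowercase s except a
    word-final one becomes f. Capital S is left alone, since the
    long s has no capital form. (Hypothetical helper, simplified rule.)"""
    if not word:
        return word
    return word[:-1].replace("s", "f") + word[-1]


def ngram_url(word, year_start=1600, year_end=1900):
    """Build a comparison URL for a word and its f-variant.
    corpus=0 and smoothing=3 are copied from the original links."""
    params = {
        "content": f"{word},{long_s_as_f(word)}",
        "year_start": year_start,
        "year_end": year_end,
        "corpus": 0,
        "smoothing": 3,
    }
    return NGRAM_BASE + "?" + urlencode(params)


if __name__ == "__main__":
    for w in ("papist", "trust", "set", "she"):
        print(long_s_as_f(w), ngram_url(w))
```

As Brian notes, the result is only useful when the f-variant is not
itself an English word; the sketch does not check for that.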

------------------------------------------------------------
The American Dialect Society - http://www.americandialect.org


