Fwd: Finding the long s
Joel S. Berson
Berson at ATT.NET
Tue Oct 11 14:48:18 UTC 2011
Ingenious use of Google Ngrams to trace the decline in use of the long s.
>From: Brian Ogilvie <bwogilvie at GMAIL.COM>
>The disappearance of the long S in English is one of those phenomena
>that can be traced using Google Ngrams--unintentionally, because
>Google's OCR appears to fairly reliably misidentify the 18th-century
>long S as an F. I discovered this while fooling around with the site
>and being stunned at how rare words like "papist" were before 1800;
>then it hit me that I ought to be searching for "papift." Compare
>(I enclosed the URLs in angle brackets, which ought to make them
>clickable even if they cross a line break in your email program, but
>if not, copy and paste them into a browser.)
>If you choose any word with a non-terminal S that does not become a
>different English word when the S is replaced with an F, you will
>almost certainly see a
>similar graph, with the F-version (i.e. the long S) starting to
>decline c. 1790, crossing the short S right around 1800, and almost
>totally gone by 1815. Moreover, the upward slope on the S version is
>nearly a mirror image of the downward slope on the F version. The best
>words for testing the long S are common words, especially function
>words--though not always. Looking for she/fhe, for instance, shows a
>distinct upward trend, first of fhe and then of she. Again, though,
>1800 is the point where the long S is overtaken by its rival:
>Caveats: my completely unscientific examination of some of the
>17th-century texts suggests that Google is better at identifying the
>long S in them, possibly because the half-crossbar tends to be longer.
>And of course majuscules, such as in titles, are not identified.
>Still, the evidence seems quite strong, and it's consistent across a
>range of words.
>If we want to know _why_ this happened, research such as Paul Nash's
>article, which Jerry Morris summarized, is crucial.
>P.S. The implications for anyone who does searches in early modern
>full text databases are obvious....
The American Dialect Society - http://www.americandialect.org