Something Borrowed ...

Tom Zurinskas truespel at HOTMAIL.COM
Fri Nov 13 23:17:51 UTC 2009

It was said:
"Any small sample will invariably have a larger number of words from OE. Since most of our basic vocabulary--the most common 2,000 words or so are from OE, the percentage of words from OE will drop as the sample grows."

No, it won't, if we are talking about typical text.  The oft-repeated words will still hold sway.  The small sample is perfectly fine as is.  Take another and another.  Many small samples make a big one.

It's one thing to talk about a list of words such as a dictionary list.  It's another to consider words in print, which include repetition.  See truespel books one and four as a prime example.
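
A minimal sketch of the difference between counting words in print (every repetition counts) and counting a list of distinct words. The function names, the toy sentence, and the toy origin table are hypothetical and are there only for illustration; the Chaucer figures are the ones quoted from Cannon in the message below.

```python
from collections import Counter


def type_token_ratio(types: int, tokens: int) -> float:
    """Share of the running words (tokens) that are distinct words (types)."""
    return types / tokens


def oe_shares(words, origin):
    """Return (share of running words from OE, share of distinct words from OE)."""
    counts = Counter(words)
    oe_tokens = sum(n for word, n in counts.items() if origin[word] == "OE")
    oe_types = sum(1 for word in counts if origin[word] == "OE")
    return oe_tokens / len(words), oe_types / len(counts)


# Hypothetical toy sample and origin table, not taken from any corpus.
toy_origin = {"the": "OE", "king": "OE", "and": "OE", "queen": "OE", "court": "OF"}
toy_text = "the king and the queen and the court".split()

# Counting repetitions gives a higher OE share than counting distinct words.
print(oe_shares(toy_text, toy_origin))

# Figures quoted below from Cannon: unique words / total words.
print(round(type_token_ratio(88, 120), 3))      # Epilogue to the Nun's Priest's Tale
print(round(type_token_ratio(3612, 65625), 3))  # Troilus and Criseyde
```

The first number in the pair is the share by words in print, the second the share by a dictionary-style list; they answer different questions, so the choice of count needs to be stated before percentages are compared.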

Tom Zurinskas, USA - CT20, TN3, NJ33, FL7+
see phonetic spelling

> ---------------------- Information from the mail header -----------------------
> Sender: American Dialect Society
> Poster: Dave Wilton
> Subject: Re: Something Borrowed ...
> -------------------------------------------------------------------------------
> Sonnet 18 is not a representative sample. It is very small. It is by a
> single writer. It is a single work. It is writing of a different era. You
> cannot extrapolate from this extremely limited sample (I would not call it a
> "corpus," which implies a degree of comprehensiveness that a single sonnet
> lacks) to draw conclusions about the language as a whole.
> Any small sample will invariably have a larger number of words from OE.
> Since most of our basic vocabulary--the most common 2,000 words or so are
> from OE, the percentage of words from OE will drop as the sample grows. Only
> 30% of the words in the sonnet are repeated. As the sample grows, the number
> of repeated words will also grow. The following figures are taken from
> Christopher Cannon's "The Making of Chaucer's English" (Cambridge Univ
> Press, 1998). The "Epilogue to the Nun's Priest's Tale" is 120 words long
> (about the same as Sonnet 18). 88 of these words are "unique" (defined as
> having a headword in the MED). This is about the same for the sonnet. But
> take Chaucer's "Troilus and Criseyde," at 65,625 words total, it has only
> 3,612 unique words, or a mere 5.5%.
> Unfortunately, Cannon does not give figures for OE v. OF origins (but he
> does include a complete glossary of Chaucer's corpus with origins marked, so
> you can tally them up if you want to spend the time). I did a similar study
> of Thomas Hoccleve's (a 15th century protégé of Chaucer) "Complaint" and
> "Dialogue" and found:
> Poem          Total words   Germanic   Unique words   Germanic
> "Complaint"   3,200         88%        803            67%
> "Dialogue"    6,394         87%        1,205          58%
> (I did not distinguish between OE and later borrowings from Dutch and other
> languages, but the overwhelming majority of the Germanic words were from OE.
> Similarly, the Romance words include ones borrowed directly from Latin, but
> most are OF or AF.)
> Looking at the words repeated the most in the two poems shows the prevalence
> of OE origins for the most common words. The 55 most repeated words in the
> poem all have Germanic roots. It wasn't until the 56th most common word that
> I found one with an OF root. The next OF root came in at 91 on the list of
> most common words. But we can see in the Hoccleve samples, as the size of
> the sample grows, and we start diluting the influence of the most common
> words, the percentage of words with a Romance origin also grows.
> This also gets us into the writings of a single writer. Hoccleve is not
> known for his aureate or Latinate diction, and we should expect a rather
> high percentage of Germanic words in his writing. A different 15th century
> writer, Lydgate for instance, should have different results and more Romance
> words in his corpus. You need to look at a corpus comprising many writers
> before drawing general conclusions. You also need to examine multiple
> genres. Poetry is one thing, but the diction will be different for
> scientific and technical works, political speeches, friendly letters, etc.
> You also need to examine multiple works. What was Shakespeare trying to
> accomplish with Sonnet 18? His diction here is very particular, very simple,
> and as a result very Germanic. He is writing to achieve a particular effect,
> and this affects his diction. Is he, for example, rebelling against the
> prevalence of "inkhorn" terms in Elizabethan poetry and demonstrating
> virtuosity with simple words? And era makes a difference. Chaucer and
> Hoccleve, which I chose because the numbers were at hand, are not good
> examples for how we write and speak today, nor is Elizabethan English. If
> you want to draw conclusions about the language as she is spoken today, you
> need a large and broad corpus of 21st century works.
> -----Original Message-----
> From: American Dialect Society [mailto:ADS-L at LISTSERV.UGA.EDU] On Behalf Of
> Robin Hamilton
> Sent: Friday, November 13, 2009 2:25 AM
> Subject: Something Borrowed ...
> In the wake of the recent thread on this list to do with the percentage of
> the English vocabulary which is "native" (so to speak) and that which is
> borrowed, I determined to explore this question by examining a
> representative corpus of English. The results of this study were somewhat
> startling, to say the least.
> It emerged that 80% of English words are native in origin, with the
> remaining 20% being taken entirely from Old French. Furthermore, the Old
> French borrowings are found in a narrowly restricted period of one hundred
> years [N1]. Words borrowed from Latin and Greek were conspicuously absent
> [N2].
> The corpus of English chosen for study encompasses the entirety of
> Shakespeare's Sonnet 18, and identification of the origins of the words
> found there was performed by reference to The Oxford English Dictionary,
> online edition [N3].
> The Corpus of Lexical Items:
> Shall I compare thee to a Summers day?
> Thou art more louely and more temperate:
> Rough windes do shake the darling buds of Maie,
> And Sommers lease hath all too short a date:
> Sometime too hot the eye of heauen shines,
> And often is his gold complexion dimm'd,
> And euery faire from faire some-time declines,
> By chance, or natures changing course vntrim'd:
> But thy eternall Sommer shall not fade,
> Nor loose possession of that faire thou ow'st,
> Nor shall death brag thou wandr'st in his shade,
> When in eternall lines to time thou grow'st,
> So long as men can breath or eyes can see,
> So long liues this, and this giues life to thee.
> SHAKE-SPEARES SONNETS published by Thomas Thorpe in 1609.
> © 1995, 1998, 2004 Hardy M. Cook and Ian Lancashire
> The actual percentage figures, rounded up or down to whole numbers, are:
> Words:
>        Total       Unique [$1]   Important [$2]
>        114         81            60
> OE     100 / 88%   68 / 84%      47 / 78% [N4]
> OF     14 / 12%    13 / 16%      13 / 22%
> [$1] - eliminating repetitions and plural forms of the same word.
> [$2] - eliminating conjunctions, prepositions and pronouns to leave nouns,
> verbs, adjectives, etc.
> (In order to clarify these figures, words noted by the OED as either Old or
> Middle English in origin, without a foreign source, are subsumed under the
> general rubric, OE. Similarly, the Old French, Anglo-French, and French of
> the OED all huddle together here as OF.)
> I leave it to others to determine why the "myth" of English borrowing from
> Latin, and indeed any language other than Old French, has persisted
> unchallenged for so long. It is sufficient for me to have done my small
> part in dispelling the fug of disinformation which has persistently
> beclouded the study of this aspect of English lexicography.
> Caveats and Further Considerations
> To be representative of the full historical range of English, this study
> should of course be extended to take in a wider range of corpora from
> periods other than the early seventeenth century. It is for this reason
> that the author of this paper will be actively seeking financial support in
> order to extend his conclusions to encompass the sixteenth century (Thomas
> Wyatt, "Farewell love, and all thy laws forever" [*N1]), the later
> seventeenth century (Milton on his late departed saint), the nineteenth
> century (Wordsworth on Westminster Bridge), and the twentieth century
> (Rupert Brooke, "The Soldier" - 'If I should die, think only this of me'
> [*N2]).
> The eighteenth century is unfortunately barren of suitable texts, no sonnets
> having been committed to writing throughout this period, and thus must be
> considered a _locus incognitus lexicalium_. This apparent deficiency in the
> scope of the study will be countered by a consideration of the first of
> Elizabeth Barrett Browning's _Sonnets from the Portuguese_ in order to
> determine whether there are any gender-specific aspects to English lexical
> borrowing, and Edwin Morgan's _50 Renaissance Sonnets_ [translated], _Ten
> Glasgow Sonnets_, and the complete sequence of _Sonnets From Scotland_,
> these texts providing an insight into the lexical underpinnings of the
> author's native land.
> A wider exploration of the nature of English lexical borrowings would entail
> reference to the recently published _Historical Thesaurus of English_ (OUP)
> in order to establish whether the borrowed terms either (a) replaced
> already-present native English terms or (b) extended the semantic scope of
> the language. As this resource is not available online, the author was
> unable to consult it, and leaves such a consideration to future generations
> of scholars who, unlike him, will be in receipt of proper financial support
> for their lexicographical endeavours.
> Notes
> [N1] To be precise, the borrowings fall entirely within the chronological
> range 1275-1386. As this period begins roughly 200 years after the
> Anglo-Norman assumption of power in England, and ends 223 years before the
> publication of Shakespeare's Sonnets, it would seem likely that these
> borrowings are the result of a long and sustained campaign by the English
> government to transform the language into a condition suitable for the
> publication of Parliamentary Statutes in English rather than Anglo-Norman,
> an event which occurred in the course of the reign of Henry VII. While
> proffered merely as a hypothesis, such a conclusion would seem to be
> consonant with the material here presented.
> [N2] This absence should be qualified by the observation that the majority
> of the Old French words found in the lexical corpus examined are themselves
> derived from Latin. However, as the preponderance of Old English words in
> the corpus are similarly derived from Common Germanic, or have Germanic
> cognates, it seems legitimate to disregard this aspect of Secondary Latin
> Borrowing. A proper consideration of this aspect of the materials could be
> conducted by means of a recourse to Pokorny's _Indogermanisches
> etymologisches Woerterbuch_
> (
> sename=\data\ie\pokorny).
> [N3] It might be argued that a consultation of the Ann Arbor Dictionary of
> Medieval English and the Dictionary of
> the Older Scottish Tongue would have been
> productive. This, due to constraints of time and failing eyesight, the
> author was unable to undertake.
> [N4] In the case of one of these OE words, "of", there is a degree of
> ambiguity in that, while the word itself is native, the sense in which it is
> used in the corpus derives from OF. Regardless of the way in which we
> choose to consider the term "of", however, it does not markedly skew the
> overall percentage figures.
> Notes to Caveats and Further Considerations:
> *N1 In treating this sonnet by Wyatt, the text in the Egerton manuscript
> will of course be collated with that found in the Devonshire MS and
> _Tottel's Miscellany_. Reference to the Arundel MS would seem to be
> nugatory, since this derives directly from Egerton, and, as is well known,
> the Blage MS is silent with regard to this particular text.
> *N2 It might be argued that the choice of that sonnet is vitiated by
> Brooke's clear dependence on an earlier text by Thomas Hardy, "Drummer
> Hodge". Please to note that the author of this study has taken this issue
> into account, but has chosen, for reasons sufficient unto himself, to
> disregard it.
> R.W.Hamilton
> (M.A. [Glasgow]; D.Phil. [York]),
> Professor Emeritus
> University of 'Pataphysics, Cockaigne.
> ------------------------------------------------------------
> The American Dialect Society -