[Lexicog] Corpus Conundrum #1
bolstar1
bolstar1 at YAHOO.COM
Sat Jul 7 16:19:21 UTC 2007
Fellow word-people. I want to mention a point about the values of,
and difficulties of, using corpora, as it relates to punctuation.
First, I want to make it clear that I am not a computational
linguist, nor a strictly computational lexicographer. I use my
database/corpus for screening of particular forms and functions, and
it suits my purposes, but it is not built on the same paradigms as,
say, the Brown Corpus, which is a wonderful subset of the written
English language. Therefore, I don't purport to know all the
conundrums, complexities, and constraints of corpora in general. And
I don't have immediate access to a general comparative corpus.
(An example of why common comma use matters: back in the
eighteen-nineties a new law was passed in the States that read "All
foreign fruit-plants are free from duty." Clear enough -- until a
legislative clerk, transcribing the law, thought less of the hyphen
than of the comma and changed the wording to "All foreign fruit, plants
are free from duty." The comma, of course, then read as a stand-in
for "and" rather than as part of a hyphenated noun. It
only cost American taxpayers $2,000,000, but in those days
$2,000,000 was $2,000,000.)
This point has to do with one form, and one example of that
form, that stands out to me. And that is the comma. What
punctuation point is more often misused (or rather "variantly used"),
if for no other reason than that the others [colon; semicolon; dash;
ellipsis] are used so much less frequently that they don't have the
opportunity of being variantly used? (I use "variantly" as opposed
to "variously" because within the 25+ ways to use a
comma "correctly," there are many legitimate variations within
those ways.)
To refine the example to one adverbial lexeme, I'll use "too,"
especially in reference to its meaning "in addition to"
and "furthermore." Its other meaning as an adjective intensifier
as well as its homonyms (homophones/heteronyms) "to" and "two" will
be used as subset examples.
In its meaning of "in addition to"/"furthermore," "too" is
commonly seen without the pre-positioned comma in its sentence-ending
or phrase-ending position, though not in its mid-sentence
position, which almost always gets the comma. This is perhaps more an
American style than a British one. I was taught, and rightly so I
believe, to separate this use of "too" from all its surrounding
words, not just in mid-sentence. Whether it is proper or not,
standardization in open prose would be a positive development.
I will use Shakespeare editions to go by here. General corpora
that compare American writing (not necessarily transcribed speech)
to British, Australian, New Zealand, S. African, etc. writing could
determine a true statistical variation.
Specifically, in "A Midsummer Night's Dream," out of the nine
times Shakespeare uses the form ", too" with the meaning
of "additionally"/"moreover," both Riverside and Signet omit the
pre-positioned comma (mid-sentence or not), and the Folger Edition
(from the American-based Shakespeare institution) splits seven to two in
favor of omitting the comma. It is true that in the Riverside Edition
G. Blakemore Evans, in his preface "Shakespeare's Text", suggests
that as an editorial principle, in-so-far as editing Shakespeare
goes, "who ignores the punctuation of the copy-text [original Folio
or quartos], does so at the risk of continual damage to the movement
and frequently to the meaning of the lines, either verse or prose."
Interestingly, in that sub-context, he referred directly to Samuel
Johnson's over-use of punctuation in editing Shakespeare, "who feels,
as Dr. Johnson did, that punctuation is entirely in his power."
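For anyone who wants to repeat this kind of count on an electronic text, here is a minimal Python sketch. It is not the method I used for the editions above; the function name `count_final_too` and the sample string are made up for illustration, and the pattern is only a heuristic: it treats any "too" that is directly followed by punctuation (or by the end of the text) as the "additionally" sense, which will not always be right.

```python
import re

# Heuristic (an assumption, not a tested rule): additive "too" is taken to be
# any "too" immediately followed by punctuation or the end of the text.
# Group 1 captures the optional comma placed before it.
FINAL_TOO = re.compile(r"(,?)\s+too(?=\s*(?:[.?!;:,]|$))", re.IGNORECASE)

def count_final_too(text):
    """Return (with_comma, without_comma) counts for punctuation-final 'too'."""
    with_comma = without_comma = 0
    for match in FINAL_TOO.finditer(text):
        if match.group(1):
            with_comma += 1
        else:
            without_comma += 1
    return with_comma, without_comma

# Hypothetical sample built from lines quoted in this post.
sample = (
    "But you must join in souls to mock me too? "
    "To strike me, spurn me, nay, to kill me too. "
    "I went there, too. "
    "You are too young, too happy, and too good."
)

print(count_final_too(sample))
```

Run over the same play in two or three editions, the pair of counts gives a rough comma/no-comma split per edition. The intensifier uses ("too young") are skipped only because a word follows them, so every hit would still need checking by eye.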
A couple of insults from Shakey using the adverbial lexeme
meaning "additionally"/"moreover" --
A Midsummer Night's Dream 3.02.149-150
Can you not hate me, as I know you do,
But you must join in souls to mock me too?
A Midsummer Night's Dream 3.02.312-313
But he hath chid me hence, and threatened me
To strike me, spurn me, nay, to kill me too.
As You Like It 3.05.043(2)-044
'Od's my little life,
I think she means to tangle my eyes too!
'Od's* God save (Sig) ||
tangle* ensnare (Riv)
Cymbeline 2.01.024-025
SECOND LORD: [Aside.]
You are cock and capon too,
And you crow cock, with your comb on.
and* if (Sig)
A couple of insults from Shakey using the adjective lexeme
meaning "overly"
All's Well That Ends Well 2.03.213-215(1)
LAFEW:
Do not plunge thyself too far in anger, lest
Thou hasten thy trial; which if -- Lord have mercy on
Thee for a hen!
hasten thy trial* be (tested and) found out sooner (Riv)
hen* coward
All's Well that Ends Well 2.03.097-098
HELENA [To FOURTH LORD.]
You are too young, too happy, and too good,
To make yourself a son out of my blood.
Much Ado About Nothing 1.01.167-169(1)
BENEDICK:
Why, i' faith, methinks she's too low for a
High praise, too brown for a fair praise, and too little
For a great praise.
low* short (Riv)
fair* plainly to be seen; distinct (Oni)
Romeo and Juliet 2.02.118-120
It is too rash, too unadvised, too sudden;
Too like the lightning, which doth cease to be
Ere one can say it lightens.
Henry VIII 5.02.108-109(1)
CROMWELL:
My Lord of Winchester, y' are a little,
By your good favor, too sharp;
Julius Caesar 3.01.077
Shakespeare coinage
CAESAR:
Et tu, Brute? Then fall Caesar. [Dies.]
Love's Labor's Lost 5.01.013-014
He is too picked, too spruce, too affected, too odd,
as it were, too peregrinate, as I may call it.
picked* refined (Sig) || fastidious (Riv)
peregrinate* foreign in manner (Sig) || foreign (Riv)
Richard III 3.04.080
For I, too fond, might have prevented this.
fond* foolish (Sig) || (Riv)
Sonnet 038.01-04
How can my Muse want subject to invent,
While thou dost breathe, that pour'st into my verse
Thine own sweet argument, too excellent
For every vulgar paper to rehearse?
But the point is, when usage is determined by frequency, the
lesser-frequency form generally gives way to the higher-frequency one.
On the other hand, most general-use writing textbooks opt for the
comma. Using or not using the comma can have interesting consequences.
As to the difficulty of using transcribed corpora, particularly
when words, phrases, sentences, and contexts are not individually and
meticulously screened and tagged for meaning, the following examples
can ensue. Some are silly, some would rarely be conceived, some are
hyperbolic; they are all possible uses of the homonym "too" in
written form, as well as in real [vernacular] speech.
-- Here are a few jokes and one-liners, with obvious implications
for differentiation, to show that the ", too" formulation, per
se, is not in itself a qualifying punctuational lexeme
for "additionally"/"moreover".
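To make that concrete, here is a small Python sketch of what a naive, untagged search would do. The pattern `NAIVE_ADDITIVE` and the helper `naive_hits` are hypothetical names of my own, not part of any corpus tool; the example strings are taken from the one-liners in this post.

```python
import re

# Naive assumption under test: "a comma followed by 'too' means 'additionally'".
NAIVE_ADDITIVE = re.compile(r",\s+too\b", re.IGNORECASE)

examples = [
    "opportunities, and other people's, too.",  # additive sense
    "too little, too late",                      # intensifier after a comma
    "when we are old, too seldom.",              # intensifier after a comma
    "I went there too late.",                    # intensifier, no comma
]

def naive_hits(lines):
    """Return the lines that the comma-plus-'too' pattern would flag."""
    return [line for line in lines if NAIVE_ADDITIVE.search(line)]

hits = naive_hits(examples)
for line in hits:
    print(line)
```

Three of the four lines are flagged, yet only the first carries the "additionally" sense. Without sense tagging, a raw count of the comma form overstates that use.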
Some show why transcribing from spoken forms needs a heavy
hand in placing punctuation, in this case commas, in the right places.
Notice the ambiguous reduplicative forms (without a comma) that could
otherwise ensue:
The secret of success is to always take advantage of your
opportunities, and other people's, too.
too little, too late
When we are young, we change our opinions too often; when we are
old, too seldom.
Bart: Hey, Gary, I completed the jigsaw puzzle, and it only took me 14
months!
Gary: Gee, are you slow. Anyone can do a jigsaw puzzle in less than 14
months.
Bart: Not true...the box said "5 to 8 years".
Eric: I used to be a parking lot attendant, but my driving skills were
atrocious. Then I tried professional bowling, but I had to quit,
too.
Jay: Why?
Eric: I went down all the wrong alleys.
I went there, too. ,
;
--
late, but there.
I went there too late.
I went there, too too late.
I went there too, too late.
I went there too, later.
I went there too, to two-time tutu-wearing toodles.
Hey, toots!
Have fun with these, as I did in compiling them.
NOTE: If anyone has a handle on comparative stats of Brit./Am. uses
of the pre-positioned comma, please respond.
Scott Nelson