[Lexicog] Corpus Conundrum #1

bolstar1 bolstar1 at YAHOO.COM
Sat Jul 7 16:19:21 UTC 2007


Which is Corpus Conundrum 

Corpus Conundrum #1 
 
Fellow word-people: I want to raise a point about the value of corpora, 
and the difficulties of using them, as it relates to punctuation. 
First, I want to make it clear that I am not a computational 
linguist, nor a strictly computational lexicographer. I use my 
database/corpus for screening of particular forms and functions, and 
it suits my purposes, but it is not built on the same paradigms as, 
say, the Brown Corpus, which is a wonderful subset of the written 
English language. Therefore, I don't purport to know all the 
conundrums, complexities, and constraints of corpora in general. And 
I don't have immediate access to a general comparative corpus. 

     (As an example of the need for common comma use: back in the 
eighteen-seventies a new law was passed in the States that read "All 
foreign fruit-plants are free from duty." Clear enough, until a 
legislative clerk, transcribing the law, thought less of the hyphen 
than of the comma and changed the wording to "All foreign fruit, plants 
are free from duty." The comma, of course, now read as a stand-in 
for "and," not as part of a hyphenated noun. It 
only cost American taxpayers $2,000,000, but in those days 
$2,000,000 was $2,000,000.) 

     This point has to do with one form, and one example of that 
form, that stands out to me: the comma. What 
punctuation point is more often misused (or rather "variantly used"), 
if for no other reason than that the others [colon; semicolon; dash; 
ellipsis] are used so much less frequently that they don't have the 
opportunity of being variantly used? (I use "variantly" as opposed 
to "variously" because within the 25+ ways to use a 
comma "correctly," there are many legitimate variations within 
those ways.) 
     To refine the example to one adverbial lexeme, I'll use "too," 
especially in reference to its meanings "in addition" 
and "furthermore." Its other meaning, as an adjective intensifier, 
as well as its homophones "to" and "two," will 
be used as subset examples.  
     In its meaning of "in addition"/"furthermore," "too" is 
commonly seen without the pre-positioned comma in its sentence-ending 
or phrase-ending position, though not in its mid-sentence 
position, which almost always gets the comma. This is perhaps more an 
American style than a British one. I was taught, and rightly so I 
believe, to separate this use of "too" from all its surrounding 
words, not just in mid-sentence. Whether it is proper or not, 
standardization in open prose would be a positive development.
     I will use Shakespeare editions to go by here. General corpora 
that compare American writing (not necessarily transcribed speech) 
to British, Australian, New Zealand, South African, etc. writing could 
determine a true statistical variation. 
     Specifically, in "A Midsummer Night's Dream," out of the nine 
times Shakespeare uses the form ", too" with the meaning 
of "additionally"/"moreover," both Riverside and Signet omit the 
pre-positioned comma (mid-sentence or not), and the Folger Edition 
(from the American-based Shakespeare institution) splits it seven to two, in 
favor of omitting the comma. It is true that in the Riverside Edition 
G. Blakemore Evans, in his preface "Shakespeare's Text," suggests 
as an editorial principle, insofar as editing Shakespeare 
goes, that an editor "who ignores the punctuation of the copy-text [original Folio 
or quartos], does so at the risk of continual damage to the movement 
and frequently to the meaning of the lines, either verse or prose." 
Interestingly, in that sub-context, he referred directly to Samuel 
Johnson's over-use of punctuation in editing Shakespeare: the editor "who feels, 
as Dr. Johnson did, that punctuation is entirely in his power."  
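
     For anyone who wants to run this kind of tally on their own text, here is a rough Python sketch. It is only an illustration under stated assumptions, not the method I used for the edition counts above: the function name comma_before_final_too is made up, and the pattern only looks at "too" when it closes a clause (followed by punctuation or the end of the text). It cannot tell the "additionally" sense from any other.

```python
import re

# Hypothetical pattern: "too" preceded by optional comma plus whitespace,
# and followed (after optional spaces) by punctuation or end of text.
CLAUSE_FINAL_TOO = re.compile(
    r"(,?)\s+too(?=\s*(?:[.?!;:,]|$))", re.IGNORECASE
)

def comma_before_final_too(text):
    """Count clause-final "too" with and without a preceding comma."""
    counts = {"comma": 0, "no_comma": 0}
    for match in CLAUSE_FINAL_TOO.finditer(text):
        key = "comma" if match.group(1) else "no_comma"
        counts[key] += 1
    return counts

# Lines quoted elsewhere in this post, joined into one string.
lines = [
    "But you must join in souls to mock me too?",
    "To strike me, spurn me, nay, to kill me too.",
    "I went there, too.",
]
print(comma_before_final_too(" ".join(lines)))
# -> {'comma': 1, 'no_comma': 2}
```

     Intensifier uses such as "too rash" are skipped simply because a word, not punctuation, follows "too"; a real comparison of editions would still need each hit checked by eye.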


     A couple of insults from Shakey using the adverbial lexeme 
meaning – "additionally"/"moreover" --  

A Midsummer Night's Dream 3.02.149-150
     Can you not hate me, as I know you do,
     But you must join in souls to mock me too?  

A Midsummer Night's Dream 3.02.312-313
     But he hath chid me hence, and threatened me
     To strike me, spurn me, nay, to kill me too.  

As You Like It 3.05.043(2)-044
     'Od's my little life,
     I think she means to tangle my eyes too!  

     'Od's* God save  (Sig) || 
     tangle* ensnare  (Riv)

Cymbeline 2.01.024-025
     SECOND LORD: [Aside.]
     You are cock and capon too,
     And you crow cock, with your comb on. 

     and* if  (Sig)


     A couple of insults from Shakey using the adjective lexeme 
meaning – "overly" – 

All's Well That Ends Well 2.03.213-215(1)
     LAFEW:
     Do not plunge thyself too far in anger, lest
     Thou hasten thy trial; which if -- Lord have mercy on
     Thee for a hen!  

     hasten thy trial* be (tested and) found out sooner  (Riv)
     hen* coward 


All's Well That Ends Well 2.03.097-098
     HELENA [To FOURTH LORD.]
     You are too young, too happy, and too good,
     To make yourself a son out of my blood.  

Much Ado About Nothing 1.01.167-169(1)
     BENEDICK:
     Why, i' faith, methinks she's too low for a
     High praise, too brown for a fair praise, and too little
     For a great praise.  

     low* short  (Riv)
     fair* plainly to be seen; distinct  (Oni)


Romeo and Juliet 2.02.118-120  
     It is too rash, too unadvised, too sudden;
     Too like the lightning, which doth cease to be
     Ere one can say it lightens.  

Henry VIII 5.02.108-109(1)
     CROMWELL:
     My Lord of Winchester, y' are a little,
     By your good favor, too sharp;  

Julius Caesar 3.01.077
     Shakespeare coinage
     CAESAR:
     Et tu, Brute? Then fall Caesar.        [Dies.]  


Love's Labor's Lost 5.01.013-014
     He is too picked, too spruce, too affected, too odd,
     as it were, too peregrinate, as I may call it.  

     picked* refined (Sig) || fastidious  (Riv)
     peregrinate* foreign in manner (Sig) || foreign  (Riv)


Richard III 3.04.080
     For I, too fond, might have prevented this.  

     fond* foolish  (Sig) || (Riv)

Sonnet 038.01-04
     How can my Muse want subject to invent,
     While thou dost breathe, that pour'st into my verse
     Thine own sweet argument, too excellent
     For every vulgar paper to rehearse? 



     But the point is that, when usage is determined by frequency, the 
lesser-frequency form generally gives way to the higher-frequency one. On the other hand, 
most general-use writing textbooks opt for the comma. Using or not using 
the comma can have interesting consequences.   
     
     As to the difficulty of using transcribed corpora, particularly 
when words, phrases, sentences, and contexts are not individually and 
meticulously screened and tagged for meaning, examples like the following 
can ensue. Some are silly, some would rarely be conceived, some are 
hyperbolic, but they are all possible uses of the homonym "too" in 
written form, as well as in real [vernacular] speech. 
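
     To make the tagging problem concrete, here is a small Python sketch of the sort of naive rule an unscreened corpus search might apply. It is purely illustrative: classify_too and its labels are hypothetical names, and the rule (a word right after "too" means the "overly" sense, anything else means "additionally") is an assumption, not an established tagging scheme.

```python
import re

# Split into words (letters and apostrophes) and single punctuation marks.
TOKEN = re.compile(r"[A-Za-z']+|[^\sA-Za-z']")

def classify_too(sentence):
    """Label each "too" by what follows it (a deliberately naive rule)."""
    tokens = TOKEN.findall(sentence)
    labels = []
    for i, tok in enumerate(tokens):
        if tok.lower() != "too":
            continue
        nxt = tokens[i + 1] if i + 1 < len(tokens) else ""
        if nxt and nxt[0].isalpha():
            labels.append("intensifier")   # "too late", "too rash"
        else:
            labels.append("additive")      # "too." / "too," / end of text
    return labels

print(classify_too("I went there, too."))           # ['additive']
print(classify_too("I went there too late."))       # ['intensifier']
print(classify_too("I went there too, too late."))  # ['additive', 'intensifier']
```

     Note that the rule leans entirely on the punctuation. Give it a transcript with no commas ("I went there too too late") and both instances come back as "intensifier," whether or not the speaker meant "also, too late."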

 -- Here are a few jokes and one-liners, with obvious implications 
for differentiation, which show that the ", too" formulation, per 
se, is not in itself a qualifying punctuational lexeme 
for "additionally"/"moreover."
 Some show why transcribing from spoken forms needs a heavy 
hand in placing punctuation, in this case commas, in the right places. 
Notice the ambiguous reduplicative forms (without a comma) that could 
otherwise ensue: 


The secret of success is to always take advantage of your 
opportunities, and other people's, too.

too little, too late

     When we are young, we change our opinions too often; when we are 
old, too seldom.


Bart: Hey, Gary, I completed the jigsaw puzzle, and it only took me 14
     months!
Gary: Gee, are you slow. Anyone can do a jigsaw puzzle in less than 14
     months.
Bart: Not true...the box said "5 to 8 years".

Eric: I used to be a parking lot attendant, but my driving skills were
     atrocious. Then I tried professional bowling, but I had to quit,
     too.
Jay: Why?
Eric: I went down all the wrong alleys.


     I went there, too [. / , / ; / --] late, but there.
     I went there too late.
     I went there, too too late.
     I went there too, too late.
     I went there too, later.
     I went there too, to two-time tutu-wearing toodles.


Hey, toots!

Have fun with these, as I did in compiling them. 

NOTE: If anyone has a handle on comparative stats of Brit./Am. uses 
of the pre-positioned comma, please respond. 

Scott Nelson






 