[Lexicog] Tagging and parsing

Patrick Hanks hanks at BBAW.DE
Tue May 30 11:51:05 UTC 2006


[from lexicographylist]

Ron Moe said:

> Is there any reason why we can't mark the semantic class of lexemes as well
> as the morphological or syntactic class?

Gosh, Ron, this is the $64 million question in lexical research, as far as I
am concerned.

A first reason is the prevalence of reductionism in taggers and parsers,
which would be very hard to get away from, I believe. Currently what happens
is this: the tagger finds word boundaries and tags each word with a p-o-s
label, then the parser comes along and groups the tagged words into phrases.
A semantic tagger would need to treat some phrases as equivalent to words,
and would start by identifying clause roles and valencies, not word classes
and tree structures. This may imply a radical shift in theoretical
foundations for many computational linguists. Thus, for the verb "treat"
(and possibly others), we need to classify "with respect" and
"respectfully" together, and distinguish them from "with antibiotics" and
"chemotherapeutically". What actually happens in present-day parsing and
tagging is the very reverse: the single-word items ending in -ly are tagged
as /ADVERB, while the two-word items are left to parsers, which find the PPs
and leave it at that.
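
The contrast above can be sketched in a few lines of Python. This is only an
illustration under stated assumptions: the lexicon ADVERBIAL_CLASSES, the
function names, and the class labels ATTITUDE and MEDICAL are hypothetical
and hand-built for the "treat" example; no existing tagger or parser works
this way as far as this sketch is concerned. The point is that the lookup
keys on semantic class and treats a two-word PP and a one-word -ly adverb
alike.

```python
# Hypothetical lexicon: adverbials of any length mapped to a semantic class.
# Entries and class labels are illustrative assumptions, not a real resource.
ADVERBIAL_CLASSES = {
    "with respect": "ATTITUDE",
    "respectfully": "ATTITUDE",
    "with antibiotics": "MEDICAL",
    "with chemotherapy": "MEDICAL",
    "chemotherapeutically": "MEDICAL",
}


def classify_adverbial(tokens, start):
    """Return (semantic_class, end_index) for the longest lexicon entry that
    begins at tokens[start], or (None, start) if nothing matches."""
    max_len = max(len(key.split()) for key in ADVERBIAL_CLASSES)
    for length in range(min(max_len, len(tokens) - start), 0, -1):
        candidate = " ".join(tokens[start:start + length]).lower()
        if candidate in ADVERBIAL_CLASSES:
            return ADVERBIAL_CLASSES[candidate], start + length
    return None, start


def adverbial_class(sentence):
    """Return the semantic class of the first known adverbial in the
    sentence, regardless of how many words it contains."""
    tokens = sentence.rstrip(".").split()
    for i in range(len(tokens)):
        sem_class, _ = classify_adverbial(tokens, i)
        if sem_class is not None:
            return sem_class
    return None


if __name__ == "__main__":
    for s in ("They treat her with respect.",
              "They treat her respectfully.",
              "They treat her with antibiotics.",
              "They treat her chemotherapeutically."):
        print(s, "=>", adverbial_class(s))
```

A real system would of course need far more than a lookup table; the sketch
only shows that word count plays no part in the classification.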

A second reason is that we don't know what the (useful) semantic classes
are. Semantic roles like "Agent" and "Patient" don't seem to help much when
it comes to distinguishing one meaning from another.  Jackendoff's "basic
semantic parts of speech" -- Human, Physical Object,  Event, etc. -- are a
more promising start but quickly run into difficulties. (I could go on.)

A third reason is that, as lexicographers, we don't know a priori what level
of generality is needed. The verb "hazard", in its usual modern meaning, is
very specific: it takes only one lexical item as its normal direct object,
namely, "a guess". (If I talk about hazarding a definition or an
explanation, I'm implying that the definition or explanation is really only
a sort of guess.) This sense of "hazard" can then be distinguished from a
much rarer one -- a synonym of "risk" -- in which one hazards some valued
object.  To take another example, for some purposes it is useful to
distinguish [[Firearm]] from [[Projectile]], while for other purposes "gun"
and "bullet" can be lumped together as [[Artefact]]. Thus, the distinction
between firing a gun and firing a bullet is important if you want to know
what it was that moved through the air and (maybe) hit the target, but not
if you only want to distinguish the firearms sense from "firing an
employee".
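
The point about level of generality can be illustrated with a small sketch.
Everything named here is an assumption made for the example: the PARENT
hierarchy, the NOUN_TYPE assignments, the top node "Entity", and the two
pattern lists are hypothetical and cover only the "fire" cases mentioned
above. The same noun types support either a coarse or a fine set of
distinctions, depending on which patterns are consulted.

```python
# Hypothetical is-a hierarchy of semantic types (child -> parent).
PARENT = {
    "Firearm": "Artefact",
    "Projectile": "Artefact",
    "Artefact": "Entity",
    "Human": "Entity",
}

# Hypothetical assignment of nouns to their most specific semantic type.
NOUN_TYPE = {
    "gun": "Firearm",
    "bullet": "Projectile",
    "employee": "Human",
}


def is_a(sem_type, ancestor):
    """True if sem_type equals ancestor or lies below it in PARENT."""
    while sem_type is not None:
        if sem_type == ancestor:
            return True
        sem_type = PARENT.get(sem_type)
    return False


# Coarse patterns: enough to separate shooting from dismissing.
COARSE = [
    ("Artefact", "discharge a weapon"),
    ("Human", "dismiss from employment"),
]

# Fine patterns: also record what the direct object is.
FINE = [
    ("Firearm", "discharge: object is the weapon"),
    ("Projectile", "discharge: object is what moves through the air"),
    ("Human", "dismiss from employment"),
]


def sense_of_fire(direct_object, patterns):
    """Return the first sense whose semantic type covers the direct object,
    or None if the noun is unknown or no pattern applies."""
    sem_type = NOUN_TYPE.get(direct_object)
    if sem_type is None:
        return None
    for pattern_type, sense in patterns:
        if is_a(sem_type, pattern_type):
            return sense
    return None


if __name__ == "__main__":
    for noun in ("gun", "bullet", "employee"):
        print(noun, "|", sense_of_fire(noun, COARSE),
              "|", sense_of_fire(noun, FINE))
```

With the coarse patterns "gun" and "bullet" receive the same sense; with the
fine patterns they are kept apart. Neither level is right a priori, which is
the difficulty described above.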

Thanks for provoking me into setting out my thoughts on this.

Patrick


----- Original Message ----- 
From: "Ron Moe" <ron_moe at sil.org>
To: <lexicographylist at yahoogroups.com>
Sent: Monday, May 29, 2006 9:15 PM
Subject: RE: [Lexicog] Slots and slot fillers (nee "Nouns")


> Is there any reason why we can't mark the semantic class of lexemes as well
> as the morphological or syntactic class? If a parser can look at neighboring
> words and note the syntactic class of those words, is there any reason why
> it can't note the semantic class as well? We have ways of marking the
> semantic class of lexemes. My list of semantic domains is an attempt in that
> direction. I see no reason why we can't hope to do what Patrick asks. But
> designing parsers is beyond my competency. Does anyone with a knowledge of
> parsers have an answer?
>
> Ron Moe
>
> -----Original Message-----
> From: lexicographylist at yahoogroups.com
> [mailto:lexicographylist at yahoogroups.com]On Behalf Of Patrick Hanks
> Sent: Monday, May 29, 2006 7:25 AM
> To: lexicographylist at yahoogroups.com
> Subject: Re: [Lexicog] Slots and slot fillers (nee "Nouns")
>
>
>
> Thanks, Rudy.  Very instructive.
>
> When doing corpus-based analysis of verb meaning and use in English, I'd
> love to have a semantically driven parser that could distinguish adverbials
> of manner/attitude from instrumental adverbials, regardless of the number of
> words involved in each. This is because the type of adverbial can sometimes
> affect the meaning of the verb, thus:
>
> treat someone {with respect / respectfully}   = ATTITUDE
> treat someone {with chemotherapy / chemotherapeutically} = MEDICAL
>
> -- where the number of words in the adverbial is immaterial and its semantic
> value is what matters.  But I suppose that is too much to hope for.
>
> Ah well, back to the grindstone.
>
> Patrick
>
>
>
> ----- Original Message -----
> From: <rtroike at email.arizona.edu>
> To: <lexicographylist at yahoogroups.com>
> Sent: Monday, May 29, 2006 10:34 AM
> Subject: [Lexicog] Slots and slot fillers (nee "Nouns")
>
>
> >
> > Whether a phonological sequence is a "word" or a "phrase" is sometimes in
> > the eye of the beholder, or depends on the structure of the language
> > involved. In English, we write prepositions with a space before and after
> > them, but in Turkish (and most SOV languages), what corresponds
> > semantically (per Ron Moe) is placed at the end of the phonological
> > sequence, and is generally called a "case suffix" or sometimes, if written
> > with a space, a "postposition". In English, the GENITIVE marker is written
> > solid with what precedes (more later), albeit with an apostrophe (-'s) [due
> > to the false notion that arose in the 17th century that this was a
> > contraction of "his" -- it was not so written earlier nor is it in any
> > other Germanic language] when the Genitive expression (single or multiple
> > words) _precedes_ the head Noun, but separately, as "of", when the Genitive
> > expression _follows_ the Noun.
> >
> > (The "of", being weakly stressed, may encliticize to the Noun, and be
> > written solid with it, as "cup of coffee" becomes "cuppa coffee". -- As a
> > digression, this creates a problem grammatically and lexicographically, as
> > the {GEN} morpheme, in its allomorphic form "of", is detached
> > phonologically from the NP it is connected to grammatically; non-native
> > speakers, encountering "cuppa" in print, may wonder what it is and look for
> > it in a dictionary [the same problem, from different sources, applies to
> > common orthographic forms "hafta", "wanna", "gotta", and "coulda"]).
> >
> > Charles Fries documented the fact that in Old English, the "Saxon
> > Genitive" was used 95% of the time and the "Romance Genitive" 5% of the
> > time. By the 18th century this had reversed to 5% and 95%, respectively.
> > Similarly, the suffixed Genitive of Latin was replaced by a preposition
> > "de/di" in the modern Romance languages. It is clear, then, both from
> > cross-linguistic comparisons as well as from internal histories, that at
> > the level of semantic structure, the suffix and the preposition constitute
> > the same linguistic element, with different surface realizations based on
> > positional differences in surface structure.
> >
> > Structuralists like Bloomfield, Bloch, Fries, Hill, Hockett, Pike, and
> > Trager all recognized the hierarchical difference between the syntactic
> > position and its filler(s). A common example of the time was the use of
> > the single word "fire" as a complete utterance, which ambiguously could be
> > taken as an imperative of a Verb, ordering guns to be shot, or as an
> > elliptical existential, alerting hearers to the (possible) presence of a
> > fire. In the first instance, "Fire!" was seen as filling the following
> > hierarchical slots:
> >
> >        Sentence
> >           |
> >        Predicate
> >           |
> >          Verb
> >           |
> >         Fire
> >
> > To say that "fire" is merely a word, and nothing more, would be to miss
> > the whole significance of its use, and any valid grammar of English would
> > have to account for that. (I am reminded here of the distress expressed by
> > a fellow evening-class student in my Beginning Chinese course some years
> > ago, when the instructor mentioned that some word had two possible
> > meanings, and she insistently repeated her concern that she would not be
> > able to tell which meaning was intended. The wise instructor, after a
> > number of vain attempts to quiet her concern by illustrating contrasting
> > contexts in which the different meanings would be deployed, finally
> > uttered the memorable observation, "Madam, words do not normally occur
> > outside of sentences".)
> >
> > As for the concern about the difference between words and phrases (apart
> > from the suffix~postposition/preposition example), the labels NP, VP, DP,
> > NumP, TP, etc., although usually verbalized as "noun phrase", etc., do
> > not mean "more than one word", but rather are simple formal designations
> > of slots within a hierarchical system. (Thanks to X-bar 'theory', the
> > exact significance of these labels has changed from the original
> > Transformational-Generative grammar, but that is a technical matter.)
> >
> > No one I know would argue that "at this moment" is not a phrase, and
> > "now" is not a word, but in the following sentences, all would agree
> > that they are filling the same slot, AdvP-time (or some similar
> > designation):
> >
> >          He is leaving at this moment.
> >          He is leaving now.
> >
> > Structural linguists were very much at pains to try to distinguish
> > terminologically between word-level category labels and slot-category
> > labels. Thus for them, both of these would be "Adverbials", but only "now"
> > would be an "Adverb". (Since morphological features were used to define
> > word-classes, "pretty" would be an "Adjective", because it could take the
> > suffixes "-er" and "-est", but "beautiful" could not be an "Adjective",
> > though it could be classified as an "Adjectival" because it could be
> > compared periphrastically by the use of "more" and "most".) Rigorous
> > methodological purity was a touchstone among many, perhaps most,
> > structuralists.
> >
> >      Rudy Troike




 
Yahoo! Groups Links

<*> To visit your group on the web, go to:
    http://groups.yahoo.com/group/lexicographylist/

<*> To unsubscribe from this group, send an email to:
    lexicographylist-unsubscribe at yahoogroups.com

<*> Your use of Yahoo! Groups is subject to:
    http://docs.yahoo.com/info/terms/
 


