[Corpora-List] About Part of Speech in English and Chinese
Linas Vepstas
linasvepstas at gmail.com
Mon Nov 2 16:12:28 UTC 2009
2009/11/2 Linas Vepstas <linasvepstas at gmail.com>:
> 2009/11/2 Mike Maxwell <maxwell at umiacs.umd.edu>:
>> a rule like
>> NP --> Det Adj* N+
>> (that's a flat version; one might have intermediate levels of structure).
>> This analysis would account for the following distinction:
>> a tall church tower
>> *a church tall tower
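The flat rule quoted above can be read as a regular expression over part-of-speech tags. What follows is only a minimal sketch: the tag names, the toy lexicon TOY_TAGS, and the function matches_flat_np are all hypothetical, invented for illustration, and a real tagger would assign tags from context rather than from a fixed table.

```python
import re

# Hypothetical toy lexicon: one coarse part-of-speech tag per word.
# This table is an assumption for illustration, not a real tagger.
TOY_TAGS = {
    "a": "Det",
    "the": "Det",
    "tall": "Adj",
    "church": "N",
    "tower": "N",
}

# NP --> Det Adj* N+ written as a regular expression over a
# space-separated tag sequence.
NP_PATTERN = re.compile(r"Det( Adj)*( N)+")


def matches_flat_np(phrase):
    """Return True if the phrase's tag sequence fits Det Adj* N+."""
    tags = " ".join(TOY_TAGS[word] for word in phrase.lower().split())
    return NP_PATTERN.fullmatch(tags) is not None


accepted = matches_flat_np("a tall church tower")   # tags: Det Adj N N
rejected = matches_flat_np("a church tall tower")   # tags: Det N Adj N
```

Under these assumptions the first phrase is accepted and the second is rejected, because once a noun has been seen the rule allows no further adjectives.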
I failed to make my intended point. Suppose, for a moment, that
almost all churches had an architectural component called a
"tall tower", so that guidebooks and architectural digests might
sometimes talk about the "tall tower" of a church. Were this the case,
it would be just fine to talk about "a church tall tower", because
everyone would know that "tall tower" names the primary semantic
entity, and so "church" would simply be a noun modifier (_nn) of that
entity. One could then validly say "church tall tower" whenever one
had to distinguish the "tall tower" of a church from the "tall tower"
of something else.
That my artificial example occurs in real life is witnessed by the
biomedical literature, where many "architectural" structures have
names of the form "adj-noun", which must then be further refined
with additional modifiers that may be nouns or adjectives.
Examples below.
> We extracted human umbilical vein endothelial cells.
> We extracted smooth muscle myosin heavy chain protein.
> We extracted peripheral blood mononuclear cells.
> It is located on the nuclear envelope inner membrane.
> We extracted simian virus large T-antigen.
> We collected HTLV-I infected T-cells.
To correctly parse these sentences, one must isolate the primary
or dominant semantic entity (e.g. "endothelial cells") and then
search for its modifiers (e.g. "human umbilical vein"). Writing rules
for this is not easy.
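One way to picture the "isolate the dominant entity, then collect modifiers" step is a lookup against a list of known entity names. This is a rough sketch under strong assumptions: the noun phrase has already been chunked out of the sentence, the entity list (known_entities) is a hypothetical hand-built lexicon, and the dominant entity is taken to be the longest known suffix of the phrase. None of this is a claim about how any existing parser works, and the suffix assumption will not hold for every phrase.

```python
def split_head_and_modifiers(tokens, known_entities):
    """Split a noun phrase into (modifiers, head).

    The head is the longest suffix of the phrase found in
    known_entities; if no suffix is known, fall back to treating
    the last token alone as the head (an assumption of this sketch).
    """
    if not tokens:
        raise ValueError("empty noun phrase")
    # Scan from the longest suffix to the shortest, so that
    # "endothelial cells" is preferred over a bare "cells".
    for start in range(len(tokens)):
        candidate = " ".join(tokens[start:])
        if candidate in known_entities:
            return tokens[:start], candidate
    return tokens[:-1], tokens[-1]


# Hypothetical lexicon of multi-word entity names.
known_entities = {"endothelial cells", "inner membrane", "tall tower"}

biomedical = split_head_and_modifiers(
    "human umbilical vein endothelial cells".split(), known_entities
)
architectural = split_head_and_modifiers(
    "church tall tower".split(), known_entities
)
```

With this lexicon, "endothelial cells" comes out as the head with "human", "umbilical", "vein" as modifiers, and "church tall tower" splits into the modifier "church" and the head "tall tower". The hard part, of course, is getting such a lexicon in the first place.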
--linas