[Corpora-List] About Part of Speech in English and Chinese

Linas Vepstas linasvepstas at gmail.com
Mon Nov 2 16:12:28 UTC 2009


2009/11/2 Linas Vepstas <linasvepstas at gmail.com>:
> 2009/11/2 Mike Maxwell <maxwell at umiacs.umd.edu>:
>> a rule like
>>   NP --> Det Adj* N+
>> (that's a flat version; one might have intermediate levels of structure).
>>  This analysis would account for the following distinction:
>>   a tall church tower
>>  *a church tall tower

I failed to make my intended point.  Suppose, for a moment, that
almost all churches had an architectural component called a
"tall tower", so that guidebooks and architectural digests might
sometimes talk about the "tall tower" of a church.  Were this the case,
then it would be just fine to talk about "a church tall tower", because
everyone would know that a "tall tower" was the primary semantic
entity, and so "church" would simply be a _nn noun-modifier to this
semantic entity.  So then, in this example, one could validly say
"church tall tower" in those cases where one had to distinguish
between the "tall tower" of a church and the "tall tower" of something
else.
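
To make the point concrete, here is a minimal sketch in Python.  It
encodes the flat rule quoted above (NP --> Det Adj* N+) as a regular
expression over part-of-speech tags, and shows that "a church tall tower"
is rejected unless "tall tower" is first merged into a single noun-like
unit.  The tag lexicon, the compound list and the function names are
all hypothetical, made up purely for illustration.

```python
import re

# Flat rule from the quoted message: NP --> Det Adj* N+
NP_RULE = re.compile(r"Det( Adj)*( N)+")

# Toy tag lexicon (an assumption for this example only).
TAGS = {"a": "Det", "tall": "Adj", "church": "N", "tower": "N"}


def tag_sequence(words, compounds=()):
    """Tag each word; any known multi-word compound is merged into one N."""
    tags = []
    i = 0
    while i < len(words):
        for comp in compounds:
            if tuple(words[i:i + len(comp)]) == tuple(comp):
                tags.append("N")      # the whole compound acts as one noun
                i += len(comp)
                break
        else:
            tags.append(TAGS[words[i]])
            i += 1
    return tags


def is_np(words, compounds=()):
    """True if the tag sequence matches the flat NP rule."""
    return NP_RULE.fullmatch(" ".join(tag_sequence(words, compounds))) is not None


print(is_np("a tall church tower".split()))
print(is_np("a church tall tower".split()))
print(is_np("a church tall tower".split(), compounds=[("tall", "tower")]))
```

The third call differs from the second only in that "tall tower" is
declared a known compound; that single piece of lexical knowledge is
what changes the verdict.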

That my artificial example occurs in real life is witnessed in biomedical
literature, where there are many "architectural" structures whose
names have the form "adj-noun", and which must then be further refined
by additional modifiers, which may be nouns or adjectives.
Examples below.

> We extracted human umbilical vein endothelial cells.
> We extracted smooth muscle myosin heavy chain protein.
> We extracted peripheral blood mononuclear cells.
> It is located on the nuclear envelope inner membrane.
> We extracted simian virus large T-antigen.
> We collected HTLV-I infected T-cells.

To parse these sentences correctly, one must isolate the primary
or dominant semantic entity (e.g. "endothelial cells") and then
search for its modifiers (e.g. "human umbilical vein").  Writing rules
for this is not easy.
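
As a rough illustration of the "isolate the entity, then find the
modifiers" step, the sketch below assumes that a lexicon of known
entity names is already available, and takes the longest suffix of the
noun phrase found in that lexicon as the dominant entity.  The lexicon
contents and the function name split_head are hypothetical.  This
sidesteps the hard part (building such a lexicon, or writing rules
that work without one) and only shows the shape of the problem.

```python
def split_head(tokens, known_entities):
    """Split a noun phrase into (modifiers, head entity).

    The head is the longest suffix of `tokens` that appears in
    `known_entities`; if none matches, fall back to the last token.
    """
    for start in range(len(tokens)):          # start=0 tries the longest suffix first
        candidate = " ".join(tokens[start:])
        if candidate in known_entities:
            return tokens[:start], tokens[start:]
    return tokens[:-1], tokens[-1:]


# Hypothetical lexicon of "adj-noun" entity names.
ENTITIES = {"endothelial cells", "mononuclear cells", "inner membrane"}

mods, head = split_head("human umbilical vein endothelial cells".split(), ENTITIES)
print(mods, head)
```

With an empty lexicon the function degrades to "the last word is the
head", which is exactly the assumption that the examples above defeat.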

--linas

_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora
