[Corpora-List] About Part of Speech in English and Chinese

Mike Maxwell maxwell at umiacs.umd.edu
Mon Nov 2 16:51:47 UTC 2009


Linas Vepstas wrote:
> 2009/11/2 Mike Maxwell <maxwell at umiacs.umd.edu>:
>> a rule like
>>   NP --> Det Adj* N+
>> (that's a flat version; one might have intermediate levels of structure).
>>  This analysis would account for the following distinction:
>>   a tall church tower
>>  *a church tall tower
> 
> FYI, biomedical lit has some truly horrid noun-adj-noun-adj-noun chains:
> 
> We extracted human umbilical vein endothelial cells.
> We extracted smooth muscle myosin heavy chain protein.
> We extracted peripheral blood mononuclear cells.
> It is located on the nuclear envelope inner membrane.
> We extracted simian virus large T-antigen.
> We collected HTLV-I infected T-cells.

Interesting.  I wonder what is going on here?  Why does "church tall 
tower" sound so bad, and these seem OK?  Is biomedical lit the only 
domain where this happens?  (I doubt it, but it would be interesting to 
know about other domains.)  I suspect the Adj-Noun compounds (if that's 
what they are) are lexicalized, rather like 'blackboard.'
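
To make the puzzle concrete, here is a minimal sketch of the flat rule quoted above (NP --> Det Adj* N+) as a regular expression over part-of-speech tags. The tag names, the hand-assigned tag sequences, and the helper name matches_flat_np are my own assumptions for illustration, not the output of any tagger. If the tags are right, the flat rule rejects the biomedical chains along with "church tall tower", which is why treating the Adj-Noun pairs as lexicalized units looks attractive.

```python
import re

# Flat rule from the quoted message: NP --> Det Adj* N+
# Tags are joined with single spaces, so the rule becomes a regex over that string.
NP_RULE = re.compile(r"Det( Adj)*( N)+")

def matches_flat_np(tags):
    """Return True if the whole tag sequence fits NP --> Det Adj* N+."""
    return NP_RULE.fullmatch(" ".join(tags)) is not None

# Hand-assigned tags (an assumption; a real tagger may disagree).
print(matches_flat_np(["Det", "Adj", "N", "N"]))         # a tall church tower
print(matches_flat_np(["Det", "N", "Adj", "N"]))         # a church tall tower
print(matches_flat_np(["Det", "Adj", "N", "Adj", "N"]))  # the nuclear envelope inner membrane

# If "inner membrane" is treated as one lexicalized noun, the sequence fits again.
print(matches_flat_np(["Det", "Adj", "N", "N"]))         # the nuclear envelope [inner membrane]
```

The last call shows the design choice at stake: the rule itself is unchanged, and only the decision about what counts as a single N differs.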

The last example strikes me as having a different bracketing than the 
others, namely
    [[HTLV-I infected] T-cells]
maybe like
    [a [three times defeated] team]
although the "three times" is surely an adverbial modifier, while 
'HTLV-I' is either the subject of the transitive 'infected', or the 
agent of the passive participle 'infected'.

BTW, this is the sort of thing that theoretical linguists (like 
generativists, but also more traditionally oriented linguists) spend 
lots of time thinking about.  I would suspect there's quite a literature 
on this, by people like Quirk, Greenbaum, Leech and Svartvik, or Pullum 
and Huddleston.
-- 
    Mike Maxwell
    What good is a universe without somebody around to look at it?
    --Robert Dicke, Princeton physicist

_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora


