[Corpora-List] About Part of Speech in English and Chinese

Adam Kilgarriff adam at lexmasterclass.com
Mon Nov 2 10:06:00 UTC 2009


To add to Mike's response, my particular bugbears are not only noun/adj
ambiguities like chief (and many others: male, female, gold, silver, ...) but
also past-participles/adjectives, and, worst of all, -ing forms, which can
float between nouns, verbs and adjectives in a most licentious manner (and
if they modify another word, you don't know if the underlying relationship
is SUBJ or OBJ, as in Chomsky's "visiting relatives")

They are the cause of a lot of the noise in what we do for English.
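
One rough way to see how much of this there is in a tagged corpus is to count,
for each word form, the tags it has been given. A minimal sketch in Python,
assuming the corpus is available as (word, tag) pairs. The function names and
the toy sample are made up for illustration; the sample is not real Penn
Treebank data.

```python
from collections import Counter, defaultdict


def tag_distribution(tagged_tokens):
    """Map each lower-cased word form to a Counter of the tags it received."""
    dist = defaultdict(Counter)
    for word, tag in tagged_tokens:
        dist[word.lower()][tag] += 1
    return dist


def ambiguous_forms(dist):
    """Return only the word forms that were given more than one distinct tag."""
    return {word: dict(tags) for word, tags in dist.items() if len(tags) > 1}


# Toy, hand-made sample (hypothetical, NOT real treebank counts).
sample = [
    ("the", "DT"), ("chief", "JJ"), ("executive", "NN"), ("officer", "NN"),
    ("the", "DT"), ("chief", "NN"), ("executive", "NN"), ("officer", "NN"),
    ("visiting", "VBG"), ("relatives", "NNS"),
    ("visiting", "JJ"), ("relatives", "NNS"),
]

if __name__ == "__main__":
    # Lists "chief" and "visiting", the two forms tagged inconsistently above.
    print(ambiguous_forms(tag_distribution(sample)))
```

A count like this only shows that the tagging varies; it cannot say whether the
variation is annotator noise or a real ambiguity of the kind discussed here.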

Adam

2009/11/2 Mike Scott <mike at lexically.net>

>  I think there are two different aspects here. One is that as linguistic
> categories aren't well established, POS categories won't be either since
> they derive ultimately from linguistic theory. If we take cases like
> (1) church tower
> (2) tall tower
> it is clear that (2) is adjectival, but in the case of (1) some linguistic
> theories will call church a noun (because that word-form arguably is mainly
> used for nouns) while others would call it an adjective because it is here
> premodifying a noun. The former theories seem to act as if word-forms had a
> primary POS, rather as people have their gender determined before birth,
> while the latter theories allow for the possibility that words may swing both
> ways, so to speak, depending on the company they keep.
>
> The second aspect concerns the information supplied in the context or
> inferable from it. In the case of (3) ... chief distribution ...
> English simply does not tell us without more context whether we are talking
> of the way chiefs (e.g. tribal chiefs) are distributed through a population
> or territory, or whether we are talking of the main patterns of distribution
> of something. Either way, chief premodifies distribution. In POS tagging for
> such a case, the context may or may not disambiguate so POS tagging will
> necessarily, for those linguists who think word-forms have a predetermined
> POS, be varied.
>
> Cheers -- Mike
>
> Fukun Xing wrote:
>
> Hi everybody,
>    I am puzzled by the part of speech of "chief" in the phrase "the chief
> executive officer". In the Penn Treebank, "chief" in this phrase is sometimes
> tagged as "JJ" and sometimes as "NN". Could you tell me how you would tag it,
> and why? And is it safe to say that there are some PoS ambiguities in English
> that cannot be resolved even by humans? I know that it may be true in Chinese
> that it is sometimes impossible for a human to decide the right PoS of some
> words. For example, "一件 包装/v n 精美 的 礼品" (1. a present with wonderful
> decoration; 2. a present decorated wonderfully). In this sentence
> "包装" (decorate/decoration) can be tagged as a noun or a verb; both are right,
> and the choice does not affect the correct understanding of the sentence. If
> there is such a thing in English, can you give some examples?
>  Thanks in advance!
>
> Xing
>
> ------------------------------
>
> _______________________________________________
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora
>
>
> --
> Mike Scott
>
> ***
> If you publish research which uses WordSmith, do let me know so I can include it at
> http://www.lexically.net/wordsmith/corpus_linguistics_links/papers_using_wordsmith.htm
> ***
> University of Aston and Lexical Analysis Software Ltd.
> mike.scott at aston.ac.uk
> www.lexically.net
>
>
>
>


-- 
================================================
Adam Kilgarriff
http://www.kilgarriff.co.uk
Lexical Computing Ltd                   http://www.sketchengine.co.uk
Lexicography MasterClass Ltd      http://www.lexmasterclass.com
Universities of Leeds and Sussex       adam at lexmasterclass.com
================================================