[Corpora-List] Chinese and English POS

Michal Kren Michal.Kren at ff.cuni.cz
Tue Nov 3 14:33:14 UTC 2009


Let me also point out that parts of speech were originally very basic
distinctions designed for languages with rich inflection - Latin, Ancient
Greek, Sanskrit - where POS is (usually) an inherent morphological feature
of a lemma. Trying to carry them over to typologically different languages
like English or even Chinese inevitably brings a lot of difficulties, as
there is no morphological clue and determining the POS often requires (at
least partial) syntactic analysis. This is not to say that there are no
problems with POS for synthetic languages; there are lots of them, but
that's just it. As a devil's advocate, I could ask to what extent the POS
category in itself makes sense for isolating languages. Why label
different roles of the same word as parts of speech?

Regards,
Michal


> I don't believe it makes sense to look for a theory telling us what PoS a
> given word in a given context "really" is, for numerous examples such as
> those mentioned by Adam Kilgarriff in this thread.  I just don't see that
> there is a "truth of the matter" to which a theory may correspond or fail
> to correspond.  What one can do is to try to _impose_ rules that succeed
> in determining a unique PoS tagging in as many debatable cases as possible
> (and that are consistent with the linguistic consensus in clear cases);
> that's what I tried to do for English in my book "English for the
> Computer".  But I was always clear that I wasn't claiming to discover
> facts about English structure, only imposing a classification scheme on
> English.
> (It seemed to me that some of the contributors to this thread were not
> recognising this distinction, though others perhaps do.)
>
> Geoffrey Sampson
>
> _______________________________________________
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora
>


