[Corpora-List] About Part of Speech in English and Chinese

Linas Vepstas linasvepstas at gmail.com
Mon Nov 2 16:46:39 UTC 2009


2009/11/2 Taras Zagibalov <T.Zagibalov at sussex.ac.uk>:
> The language is not static and it is not possible to assign a POS to
> any sequence of characters constituting a word (whatever it means).
> Nominalisation, polysemy, homonymy and other "semantic fluctuations"
> will always make it difficult to attribute a word to a POS 'once and
> forever'. Meaning is mostly contextual, only abstract part of it can
> be stored in a dictionary (and POS is the most abstract layer of a
> word's semantics), but such an abstract meaning does not exist 'on its
> own' but only in contexts.

Heh.  I beg to differ.   I believe that the difficulties of POS w.r.t. parsing
largely go away once one realizes that there are actually many
thousands of "fine-grained POS" types that can be assigned to words,
and that these fine-grained POS types largely control syntactic
interactions. More than that: fine-grained POS tags can account
for a *great majority* of all syntactic phenomena in English.

An example. I currently maintain the open-source Link Grammar parser.
It's based on a theory of "links" between words, which are enforced by
means of the "connectors" a word can carry.  For example, the connectors
on a word might say something like "this word can only be used if there
is a determiner on the right, and a verb on the left".  It's not hard to see
that the above connector list (det-on-right, verb-on-left) is a kind of noun.
Link Grammar has about a hundred different connector types, and a
dictionary where words can get many thousands of different combinations
of these connectors.  It is entirely appropriate to think of a connector set
as a kind of "very fine-grained POS tag": once we've assigned
these fine-grained tags to words, one can account for almost all syntactic
structure.
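
To make the "connector set as tag" idea concrete, here is a minimal
Python sketch. The toy lexicon and the connector names DET+ and VERB-
are invented for illustration and are not real Link Grammar connector
names; a trailing "+" means "link to a word on the right" and a trailing
"-" means "link to a word on the left". The only point is that words
carrying the same connector set land in the same fine-grained class.

```python
from collections import defaultdict

# Hypothetical toy lexicon: word -> set of connectors it carries.
# "+" = must link to a word on the right, "-" = to a word on the left.
TOY_LEXICON = {
    "dog": {"DET+", "VERB-"},
    "cat": {"DET+", "VERB-"},
    "quickly": {"VERB-"},
}


def fine_grained_tag(word):
    """Treat the word's whole connector set as its fine-grained POS tag."""
    return frozenset(TOY_LEXICON[word])


def group_by_tag(lexicon):
    """Group words that share exactly the same connector set."""
    groups = defaultdict(list)
    for word, connectors in lexicon.items():
        groups[frozenset(connectors)].append(word)
    return dict(groups)
```

Here "dog" and "cat" get the same tag, while "quickly" gets a different
one, without any coarse label such as "noun" or "adverb" being stored.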

Another example: most prepositions have the connectors "MV- J+",
where "MV-" connects to verbs on the left, and "J+" connects to
prepositional objects on the right. Thus, "MV- J+" is an example of a
"fine-grained POS tag" in Link Grammar.
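
As a rough illustration of how such connectors pair up, here is a small
Python sketch. Only the "MV- J+" entry comes from the text above; the
verb entry "MV+", the noun entry "J-", and all function names are
assumptions made up for this example. The real parser's matching rules
and dictionary format are richer than this.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Connector:
    label: str      # connector type, e.g. "MV" or "J"
    direction: str  # "-" links to a word on the left, "+" to the right


def parse_connectors(spec: str) -> Tuple[Connector, ...]:
    """Turn a string such as "MV- J+" into a tuple of Connector objects."""
    return tuple(Connector(tok[:-1], tok[-1]) for tok in spec.split())


# Toy dictionary. The verb and noun entries are assumptions for this sketch.
TOY_DICT: Dict[str, Tuple[Connector, ...]] = {
    "ran": parse_connectors("MV+"),
    "with": parse_connectors("MV- J+"),
    "friends": parse_connectors("J-"),
}


def links_between(left_word: str, right_word: str) -> List[str]:
    """Labels of links that can join left_word to a later right_word.

    A link forms when the left word has a "+" connector and the right
    word has a "-" connector with the same label.
    """
    return [
        l.label
        for l in TOY_DICT[left_word]
        for r in TOY_DICT[right_word]
        if l.direction == "+" and r.direction == "-" and l.label == r.label
    ]
```

With this toy dictionary, links_between("ran", "with") yields ["MV"] and
links_between("with", "friends") yields ["J"], while "ran" and "friends"
cannot link directly. No phrase-level rule is consulted at any point.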

This is very different from the usual NP, VP style context-free grammar
rules -- there are no NP or VP or any other "chunks" or production rules:
there are only bare-naked words, and their fine-grained POS tags, and
this alone is sufficient to account for most of the structure of the English
language.

-- linas



