[Corpora-List] Universal POS Tagset

Serge Sharoff S.Sharoff at leeds.ac.uk
Mon Feb 2 13:53:08 UTC 2009


Another research project with similar goals is MULTEXT-East (MTE):
http://nl.ijs.si/ME/V3/msd/html/

For a recent experiment on designing a tagset following this framework, take a look at:
Serge Sharoff, Mikhail Kopotev, Tomaz Erjavec, Anna Feldman, and Dagmar Divjak. Designing and evaluating a Russian tagset. In Proceedings of the Sixth Language Resources and Evaluation Conference, LREC 2008, Marrakech, 2008.
http://corpus.leeds.ac.uk/mocky/lrec2008-msd.pdf

Serge


-----Original Message-----
From: corpora-bounces at uib.no on behalf of Adam Teichert
Sent: Fri 30/01/2009 20:53
To: corpora at uib.no
Subject: [Corpora-List] Universal POS Tagset
 
Hello all.


  I've been looking for a POS tagset that is general enough to
effectively tag "any" natural language.  (I'm looking at Linguistic
Typology / Universal Implications so I want to compare POS taggings
across many [possibly obscure] languages.) Does anyone know of such a
tagset?

  If anyone is interested in what I've found so far, this paper seems relevant:
    "Induction of Fine-grained Part-of-speech Taggers via Classifier
Combination and Crosslingual Projection" (Elliott Franco Drábek,
David Yarowsky)
    http://acl.ldc.upenn.edu/W/W05/W05-0807.pdf

  Also, I'm aware of some efforts at Microsoft Research India to
perhaps develop a "universal" tagset for Indian languages:
    http://research.microsoft.com/en-us/groups/mls/default.aspx


  Thanks for any ideas.

  --Adam (R. Teichert)

   MS Student
   School of Computing
   University of Utah

_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora




