[Corpora-List] Current state of the art of POS tagging/evaluation?

Fri May 4 08:38:45 UTC 2007

At 16:22 3-5-2007, Orion Buckminster Montoya wrote:

>We would appreciate pointers for any of the following:
>         * papers describing comparative evaluation exercises

See
   Hans van Halteren, Jakub Zavrel, Walter Daelemans:
   Improving Accuracy in NLP Through Combination of Machine
   Learning Systems. Computational Linguistics 27(2): 199-229 (2001)

You should probably also have a look at chapters 6 and 7 of
   Hans van Halteren (ed.), Syntactic Wordclass Tagging, Kluwer, 1999
which deal with evaluation (6) and choosing a tagger for your own use (7).

>Since tagger performance, for many taggers, depends on the quality and
>volume of training text, we'd also appreciate pointers on how that can
>be brought in to the evaluation, to give us a good idea of which
>tagger will perform best on our dataset.

Do you want to evaluate taggers or tagger generators?

>         * other taggers that we should consider

I have taggers trained on written/spoken material from the BNC sampler,
as well as the tagger generator with which I made them. For a description, see

   H. van Halteren, The Detection of Inconsistency in Manually Tagged Text,
   Proc. Workshop on Linguistically Interpreted Corpora 2000 (LINC 2000), 2000

>I would be particularly pleased to find a top-quality tagger with
>freely modifiable source code.

Running the tagger or generator in a comparison is no problem (if you
have access to Linux). For making the source code available, I'd have
to discuss things with the department here.

>         * data to use as 'gold standard': we are aware of the BNC
>sampler and the Penn TreeBank, though we are also aware of the roles
>these datasets have played as training and development material, for
>various taggers.  The OEC is web-sourced and covers a wide range of
>text types so ideally we shall evaluate it on a dataset like that.

Sounds like you'll have to take a (representative) sample from your own
corpus and tag it by hand. Have you decided on a tagset yet?
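
To make the sampling suggestion concrete, here is a minimal sketch in Python. It assumes the corpus is available as (text_type, sentence) pairs and that "representative" means proportional to text type; both are my assumptions, not a description of the OEC. The names stratified_sample and token_accuracy are hypothetical helpers made up for illustration.

```python
import random
from collections import defaultdict


def stratified_sample(sentences, sample_size, seed=0):
    """Draw a sample whose text-type proportions roughly follow the corpus.

    `sentences` is a list of (text_type, sentence) pairs (assumed layout).
    """
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    by_type = defaultdict(list)
    for text_type, sentence in sentences:
        by_type[text_type].append(sentence)

    total = len(sentences)
    sample = []
    for text_type, group in sorted(by_type.items()):
        # Proportional share, but at least one sentence per text type
        # and never more than the group actually contains.
        share = round(sample_size * len(group) / total)
        k = min(len(group), max(1, share))
        sample.extend((text_type, s) for s in rng.sample(group, k))
    return sample


def token_accuracy(gold_tags, predicted_tags):
    """Fraction of tokens whose predicted tag equals the hand-assigned tag."""
    if len(gold_tags) != len(predicted_tags):
        raise ValueError("gold and predicted sequences differ in length")
    if not gold_tags:
        raise ValueError("cannot compute accuracy on an empty sequence")
    correct = sum(g == p for g, p in zip(gold_tags, predicted_tags))
    return correct / len(gold_tags)
```

The sample would then be tagged by hand, and each candidate tagger scored against it with token_accuracy. Because of rounding and the one-per-type minimum, the sample can come out slightly larger or smaller than sample_size.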

Keep us informed,
Hans van Halteren