[Corpora-List] Simple POS tagger

David L. Hoover david.hoover at verizon.net
Mon Oct 18 18:10:37 UTC 2004


I need a simple POS tagger (preferably freeware) for a modest corpus of
contemporary American poetry. The total corpus is about 1,500,000 words,
but the samples are mostly under 100,000 words, and I would be happy with
a program that could handle only much smaller samples, say 10,000 words.

I am mainly interested in noun and verb statistics, and do not need to
process the tags further or to use the tagged text in any other way.
Basically, I want to determine the percentage of the text tokens that
are in the various word classes.
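
To make the goal concrete, here is a rough sketch of the calculation I
have in mind. It assumes the tagger writes its output as whitespace-
separated word/TAG tokens; that output format, the Penn Treebank-style
tags in the sample, and the function name tag_percentages are all
assumptions for illustration, not features of any particular tagger.

```python
from collections import Counter


def tag_percentages(tagged_text):
    """Return {tag: percentage of tokens} for text in word/TAG format.

    Assumption: tokens are separated by whitespace and each one looks
    like word/TAG. Tokens that do not fit that shape are skipped.
    """
    counts = Counter()
    for token in tagged_text.split():
        # Split on the last slash so a word containing "/" keeps its tag.
        word, sep, tag = token.rpartition("/")
        if not sep or not word or not tag:
            continue  # not a word/TAG pair
        counts[tag] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {tag: 100.0 * n / total for tag, n in counts.items()}


# Hypothetical sample with illustrative tags.
sample = "The/DT dog/NN bites/VBZ the/DT man/NN"
for tag, pct in sorted(tag_percentages(sample).items()):
    print(f"{tag}: {pct:.1f}%")
```

Grouping the fine-grained tags into broad classes (all noun tags, all
verb tags) would be one more step on top of this, and how to do it
depends on the tagset the tagger uses.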

I'm working with a fairly robust Windows XP Professional computer, and
would prefer something that won't take a lot of extra
installation/configuration work.

I have done some research, but there are so many choices it is difficult
to know where to start.

Any favorites?

Thanks,
David Hoover

--
David L. Hoover, Director of Undergraduate Studies & Webmaster
           NYU English Department, 212-998-8832
          http://www.nyu.edu/gsas/dept/english/

"If you pick up a starving dog and make him prosperous,
 he will not bite you. This is the principal difference
 between a dog and a man."
 -- Mark Twain, Pudd'nhead Wilson's Calendar


