[Corpora-List] Simple POS tagger
David L. Hoover
david.hoover at verizon.net
Mon Oct 18 18:10:37 UTC 2004
I need a simple POS tagger (preferably freeware) for a modest corpus of
contemporary American poetry. The total corpus is about 1,500,000 words, but
the samples are mostly under 100,000 words, and I would be happy with a
program that could handle only much smaller samples, say 10,000 words.
I am mainly interested in noun and verb statistics, and do not need to
process the tags further or to use the tagged text in any other way.
Basically, I want to determine the percentage of the text tokens that
are in the various word classes.
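
To make the goal concrete, here is a minimal sketch of the counting step, assuming a tagger has already written its output as whitespace-separated word/TAG pairs with Penn Treebank-style tags (NN* for nouns, VB* for verbs). The function name, the input format, and the decision to leave punctuation out of the total are all illustrative assumptions, not features of any particular tagger.

```python
from collections import Counter


def word_class_percentages(tagged_text):
    """Return {word_class: percent of tokens} for 'word/TAG' input.

    Assumes Penn Treebank-style tags; punctuation tags are skipped.
    """
    counts = Counter()
    for item in tagged_text.split():
        # Split on the last slash so a word containing '/' stays intact.
        word, sep, tag = item.rpartition("/")
        if not sep or not word or not tag:
            continue  # not a word/TAG pair
        if not tag[0].isalpha():
            continue  # punctuation tag such as '.' or ','
        if tag.startswith("NN"):
            word_class = "noun"
        elif tag.startswith("VB"):
            word_class = "verb"
        else:
            word_class = "other"
        counts[word_class] += 1

    total = sum(counts.values())
    if total == 0:
        return {}
    return {cls: 100.0 * n / total for cls, n in counts.items()}


sample = "The/DT dog/NN bites/VBZ the/DT man/NN ./."
result = word_class_percentages(sample)
print(result)
```

A different tagset would only require changing the two prefix tests, and whether punctuation should count as a token is a choice to settle before comparing samples.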
I'm working with a fairly robust Windows XP Professional computer, and
would prefer something that won't take a lot of extra
installation/configuration work.
I have done some research, but there are so many choices it is difficult
to know where to start.
Any favorites?
Thanks,
David Hoover
--
David L. Hoover, Director of Undergraduate Studies & Webmaster
NYU English Department, 212-998-8832
http://www.nyu.edu/gsas/dept/english/
"If you pick up a starving dog and make him prosperous,
he will not bite you. This is the principal difference
between a dog and a man."
-- Mark Twain, Pudd'nhead Wilson's Calendar