[Corpora-List] Natural Language Toolkit: NLTK-Lite version 0.6.4 released

Steven Bird sb at csse.unimelb.edu.au
Sat Apr 22 00:53:43 UTC 2006


NLTK, the Natural Language Toolkit, is a suite of Python libraries and
programs for symbolic and statistical natural language processing.

NLTK-Lite version 0.6.4 has been released and can be downloaded from
http://nltk.sourceforge.net/

CONTENTS

Software Modules: corpus readers, tokenizers & stemmers, taggers
(regexp, n-gram, backoff, Brill, HMM), parsers (recursive descent,
shift-reduce, chart, probabilistic, ...), clusterers (EM, k-means,
...), probability distributions, chatbots, demonstrations, ...

Corpora and Corpus Samples: Brown Corpus, CMU Pronunciation
Dictionary, CoNLL-2000, Genesis, Gutenberg, IEER, Presidential
Addresses, Names, PP-Attachment, Senseval 2, TIMIT, Treebank, Words

Documentation: Tutorials and exercises (161pp), API documentation for
all software modules, and installation instructions for Windows, Mac,
and Unix.

-Steven Bird


