[Corpora-List] ngram statistics package version 0.53
ted pedersen
tpederse at d.umn.edu
Wed Jan 15 04:49:55 UTC 2003
We are happy to announce an updated version of the Ngram Statistics
Package (NSP). The most current version is now v-0.53.
NSP is a suite of Perl programs that allow users to identify interesting
Ngrams in text using a variety of measures of association. It is free
software, available at http://www.d.umn.edu/~tpederse/nsp.html
There are several new features included, among them:
1) improved stoplist handling that allows stop-listed words to be defined
via regular expressions,
2) new utility tools for finding kth-order co-occurrences in corpora,
3) several new 2-dimensional (bigram) tests,
4) ... and at long last, a 3-dimensional (trigram) test (the log
likelihood ratio).
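For readers unfamiliar with these ideas, here is a rough sketch of a regular-expression stoplist combined with a bigram log likelihood ratio. It is written in Python for illustration only and is not NSP's Perl code or its interface. The names STOP_PATTERNS, is_stopped, bigram_counts and log_likelihood are hypothetical, and the rule of dropping a bigram when either word matches the stoplist is an assumption made for this example.

```python
import math
import re
from collections import Counter

# Hypothetical stoplist: each entry is a regular expression, not a literal word.
STOP_PATTERNS = [re.compile(p) for p in (r"^(the|a|an|of)$", r"^\d+$")]


def is_stopped(token):
    """Return True if the token matches any stoplist pattern."""
    return any(p.search(token) for p in STOP_PATTERNS)


def bigram_counts(tokens):
    """Count adjacent word pairs, skipping pairs that contain a stopped word.

    Assumption for this sketch: a bigram is dropped if either word is stopped.
    """
    pairs = zip(tokens, tokens[1:])
    return Counter(p for p in pairs if not (is_stopped(p[0]) or is_stopped(p[1])))


def log_likelihood(n11, n1p, np1, npp):
    """Log likelihood ratio (G^2) for a 2x2 bigram contingency table.

    n11: count of the bigram (w1, w2)
    n1p: count of bigrams with w1 in first position
    np1: count of bigrams with w2 in second position
    npp: total number of bigrams
    """
    observed = [n11, n1p - n11, np1 - n11, npp - n1p - np1 + n11]
    expected = [
        n1p * np1 / npp,
        n1p * (npp - np1) / npp,
        (npp - n1p) * np1 / npp,
        (npp - n1p) * (npp - np1) / npp,
    ]
    # Cells with an observed count of zero contribute nothing to the sum.
    return 2 * sum(o * math.log(o / e) for o, e in zip(observed, expected) if o > 0)
```

A score of zero means the two words co-occur exactly as often as independence would predict; larger scores indicate a stronger departure from independence.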
We've also revamped and expanded the documentation. A more detailed
changelog is available at
http://www.d.umn.edu/~tpederse/Code/ChangeLog.nsp-v0.53.txt
and the new README is at:
http://www.d.umn.edu/~tpederse/Code/Readme.nsp-v0.53.txt
We'd be pleased if you'd check it out and let us know what you think.
Cordially,
Ted