[Corpora-List] N-gram string extraction

Christer Johansson christer.johansson at lili.uib.no
Tue Aug 27 16:31:48 UTC 2002


May I suggest having a look at Ken Church's introduction to n-grams? You
can find it by searching for the obvious keywords, e.g. "Church Ngrams"
on Google.
Filename: kwc-ngrams.pdf

  The task can be done with simple combinations of paste, tail, sort and
uniq -c, plus filtering programs written in awk. It is hard to do much
better than that.
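
  A minimal sketch of such a pipeline for bigrams might look like the
following. It is not taken from Church's paper: the function name
bigrams, the temporary file names w1 and w2, and the letters-only,
lowercased tokenisation are all assumptions made for illustration.

```shell
# Hypothetical helper (name is an assumption): count bigrams in text read from stdin.
bigrams() {
    tmp=$(mktemp -d) || return 1
    # One lowercased token per line; any run of non-letters becomes one newline,
    # and awk drops the empty line that leading punctuation would leave behind.
    tr -sc 'A-Za-z' '\n' | tr 'A-Z' 'a-z' | awk 'NF' > "$tmp/w1"
    # The same tokens shifted up by one line, so line i now holds word i+1.
    tail -n +2 "$tmp/w1" > "$tmp/w2"
    # Pair each word with its successor, filter out the incomplete final pair,
    # count identical pairs, and list the most frequent first.
    paste "$tmp/w1" "$tmp/w2" | awk -F'\t' '$2 != ""' | sort | uniq -c | sort -k1,1nr -k2
    rm -rf "$tmp"
}
```

  It would be used as "bigrams < corpus.txt" (corpus.txt being a
placeholder name). Longer n-grams follow the same pattern: a third
column made with tail -n +3 gives trigrams, and a further awk filter
such as awk '$1 >= 2' keeps only the pairs that occur at least twice.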


