May I suggest having a look at Ken Church's introduction to n-grams. You can find it by searching for the obvious keywords (e.g. "Church Ngrams" on Google); the filename is kwc-ngrams.pdf. The task can be done with simple combinations of paste, tail, sort, and uniq -c, plus filtering programs written in awk. It is hard to do much better than that.
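
As a rough illustration of how those tools combine, here is a minimal sketch that counts bigrams. It assumes the text has already been split into one token per line; the function name count_bigrams and the sample sentence are made up for this example and are not taken from Church's notes.

```shell
# Hypothetical helper: count bigrams in a file holding one token per line.
count_bigrams() {
    cb_words=$1
    cb_next=$(mktemp)
    # Shift the token list up by one line, so line i holds token i+1.
    tail -n +2 "$cb_words" > "$cb_next"
    # paste puts token i next to token i+1 (tab-separated). The final token
    # has no successor, so awk keeps only rows with exactly two fields.
    # sort groups identical pairs, uniq -c counts them, sort -rn ranks them.
    paste "$cb_words" "$cb_next" | awk 'NF == 2' | sort | uniq -c | sort -rn
    rm -f "$cb_next"
}

# Example: turn a sentence into one lowercase word per line, then count.
cb_demo=$(mktemp)
printf 'to be or not to be\n' | tr -sc 'A-Za-z' '\n' | tr 'A-Z' 'a-z' > "$cb_demo"
count_bigrams "$cb_demo"
rm -f "$cb_demo"
```

For trigrams the same idea should extend by adding a third column produced with tail -n +3 and filtering on NF == 3. Any further selection (minimum count, particular words) can be done with another awk filter at the end of the pipeline.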