[Corpora-List] Co-occurrence stats from BNC

MCUSSHS harold.somers at manchester.ac.uk
Fri Mar 17 10:43:47 UTC 2006


Sorry if this is a dumb question: for a student project, we would like
to get the following stats based on the BNC:
(1) frequency (or probability) of all trigrams
(2) co-occurrence stats for all word pairs (NOT bigrams, note) based on
co-occurrence within the same sentence

I assume that this is easy to compute, though time-consuming; and of
course I understand that the data will be relatively sparse.
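
To make the two counts concrete, here is a minimal sketch in Python. It
is only an illustration under stated assumptions, not BNC-specific code:
it assumes the corpus has already been split into sentences and tokens
(each sentence a list of word strings), that trigrams do not cross
sentence boundaries, and that a word pair is counted once per sentence
regardless of order. The function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def trigram_counts(sentences):
    """Count word trigrams; assumes trigrams never span two sentences."""
    counts = Counter()
    for sent in sentences:
        for i in range(len(sent) - 2):
            counts[tuple(sent[i:i + 3])] += 1
    return counts


def cooccurrence_counts(sentences):
    """Count unordered word pairs that occur in the same sentence.

    Assumption: each pair is counted at most once per sentence, and the
    pair is stored in alphabetical order so (a, b) and (b, a) coincide.
    """
    counts = Counter()
    for sent in sentences:
        types = sorted(set(sent))  # distinct words in this sentence
        counts.update(combinations(types, 2))
    return counts


def relative_frequencies(counts):
    """Turn raw counts into relative frequencies (maximum-likelihood)."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {key: n / total for key, n in counts.items()}


if __name__ == "__main__":
    # Toy input standing in for tokenised corpus sentences.
    toy = [["the", "cat", "sat", "down"], ["the", "cat", "ran"]]
    print(trigram_counts(toy).most_common(3))
    print(cooccurrence_counts(toy).most_common(3))
```

The number of pairs per sentence grows quadratically with the number of
distinct words in it, so on a corpus the size of the BNC the pair table
would probably need a frequency cut-off or on-disk storage.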

So my question is: is this data already available somewhere (e.g. because
someone has already computed it), or, if not, what is the easiest way to
compute it?

Harold Somers  


