[Corpora-List] Co-occurrence stats from BNC
MCUSSHS
harold.somers at manchester.ac.uk
Fri Mar 17 10:43:47 UTC 2006
Sorry if this is a dumb question: for a student project, we would like
to get the following stats based on the BNC:
(1) frequency (or probability) of all trigrams
(2) co-occurrence stats for all word pairs (NOT bigrams, note) based on
co-occurrence within the same sentence
I assume that this is easy to compute, though time-consuming; and of
course I understand that the data will be relatively sparse.
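To make the two counts concrete, here is a minimal sketch in Python of what I have in mind. It is only an illustration under some assumptions of mine: the corpus has already been reduced to a list of tokenised sentences (extracting those from the BNC markup is not shown), trigrams do not cross sentence boundaries, and a word pair is unordered and counted once per sentence in which both words occur. The function name count_stats and the sample sentences are made up for the example.

```python
from collections import Counter
from itertools import combinations


def count_stats(sentences):
    """Count trigrams and same-sentence word pairs.

    `sentences` is assumed to be an iterable of token lists,
    one list per sentence (a hypothetical input format).
    """
    trigrams = Counter()
    pairs = Counter()
    for tokens in sentences:
        # (1) contiguous trigrams, kept within the sentence
        trigrams.update(zip(tokens, tokens[1:], tokens[2:]))
        # (2) unordered pairs of distinct word types in the sentence,
        #     counted once per sentence; sorting gives each pair a
        #     canonical (alphabetical) order
        types = sorted(set(tokens))
        pairs.update(combinations(types, 2))
    return trigrams, pairs


# Toy data standing in for the corpus
sample = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["the", "dog", "sat"],
]
trigrams, pairs = count_stats(sample)

# Relative frequencies can be derived from the raw counts
total_trigrams = sum(trigrams.values())
trigram_prob = {t: c / total_trigrams for t, c in trigrams.items()}
```

The pair counter grows roughly with the square of the number of distinct words per sentence, so on a corpus the size of the BNC the table would probably need to be pruned or kept on disk rather than in memory.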
So my question is: is this data available somewhere (e.g. has someone
already computed it), or, if not, what is the easiest way to do it?
Harold Somers