[Corpora-List] question about Wordsmith tools (log-likelihood)
Luciana Diniz
esllsdx at langate.gsu.edu
Wed Sep 20 20:50:07 UTC 2006
Hello!
I'm trying to make sense of the log-likelihood formula (in the Wordsmith
Tools manual), and I'm not sure what "d" means in:
"d := frequency of pairs involving neither w1 nor w2"
Does it mean the frequency of all possible collocates (with span
1:1) minus the frequency of word 1 (isolated frequency) minus the
frequency of word 2 (isolated frequency)?
If this is the case, would "d" be very close to the total number of
words in the corpus?
Also, if this is the case, what if I choose a different span? Would this
change the value of "d"?
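To make the question concrete, here is a minimal sketch of the usual 2x2 contingency-table reading of log-likelihood for a word pair. It assumes that "d" is whatever is left of the total number of pairs once the pairs involving w1, w2, or both are taken out. That is an assumption, not something confirmed by the Wordsmith Tools manual, and the function names and all the counts below are hypothetical, chosen only for illustration.

```python
import math

def contingency(pair_count, w1_count, w2_count, total_pairs):
    """Build the 2x2 table (a, b, c, d) for a word pair.

    Assumed meanings (hypothetical, not taken from the manual):
      pair_count  = pairs containing both w1 and w2
      w1_count    = pairs containing w1 (with or without w2)
      w2_count    = pairs containing w2 (with or without w1)
      total_pairs = all pairs counted within the chosen span
    """
    a = pair_count                 # w1 and w2 together
    b = w1_count - a               # w1 without w2
    c = w2_count - a               # w2 without w1
    d = total_pairs - a - b - c    # neither w1 nor w2
    return a, b, c, d

def log_likelihood(a, b, c, d):
    """Log-likelihood (G2) statistic: 2 * sum(O * ln(O / E))."""
    n = a + b + c + d
    observed = [a, b, c, d]
    expected = [
        (a + b) * (a + c) / n,
        (a + b) * (b + d) / n,
        (c + d) * (a + c) / n,
        (c + d) * (b + d) / n,
    ]
    # Cells with an observed count of 0 contribute nothing to the sum.
    return 2 * sum(o * math.log(o / e) for o, e in zip(observed, expected) if o > 0)

# Made-up counts, for illustration only.
a, b, c, d = contingency(pair_count=30, w1_count=1000, w2_count=500,
                         total_pairs=1_000_000)
print(a, b, c, d)
print(log_likelihood(a, b, c, d))
```

Under this reading, "d" is indeed close to the total number of pairs whenever w1 and w2 are both rare relative to the corpus. If a wider span means more pairs are counted, total_pairs would change and "d" with it, though whether Wordsmith counts pairs this way is exactly what I am unsure about.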
I'm very confused and I'd really appreciate it if somebody could help me
:)
Thank you!
Luciana.