[Corpora-List] question about Wordsmith tools (log-likelihood)

Luciana Diniz esllsdx at langate.gsu.edu
Wed Sep 20 20:50:07 UTC 2006


Hello!

I'm trying to make sense of the log likelihood formula (in the Wordsmith
Tools manual), and I'm not sure what "d" means in:

"d := frequency of pairs involving neither w1 nor w2"

Does it mean the frequency of all possible collocates (with span
1:1) minus the frequency of word 1 (isolated frequency) minus the
frequency of word 2 (isolated frequency)?
If this is the case, would "d" be very close to the total number of
words in the corpus?

Also, if this is the case, what if I choose a different span? Would this
change the value of "d"?
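
To make the question concrete, here is a minimal sketch of the 2x2 contingency table that log-likelihood scores for word pairs are commonly built on. It is an assumption that the manual's a, b, c, d follow this standard layout and that pairs are counted as adjacent word pairs; the function names and the example counts are hypothetical and are not taken from WordSmith Tools.

```python
import math

def contingency(pair_count, w1_count, w2_count, total_pairs):
    """Build the 2x2 table for a word pair (assumed standard layout).

    pair_count  : pairs containing both w1 and w2
    w1_count    : pairs containing w1 (with or without w2)
    w2_count    : pairs containing w2 (with or without w1)
    total_pairs : all pairs counted in the corpus for the chosen span
    """
    a = pair_count                 # w1 and w2 together
    b = w1_count - pair_count      # w1 without w2
    c = w2_count - pair_count      # w2 without w1
    d = total_pairs - a - b - c    # neither w1 nor w2
    return a, b, c, d

def xlogx(x):
    """x * ln(x), with the usual convention that 0 * ln(0) = 0."""
    return x * math.log(x) if x > 0 else 0.0

def log_likelihood(a, b, c, d):
    """Log-likelihood ratio (G2) for a 2x2 contingency table."""
    n = a + b + c + d
    return 2.0 * (
        xlogx(a) + xlogx(b) + xlogx(c) + xlogx(d)
        - xlogx(a + b) - xlogx(c + d)   # row totals
        - xlogx(a + c) - xlogx(b + d)   # column totals
        + xlogx(n)
    )

# Hypothetical counts, for illustration only.
a, b, c, d = contingency(pair_count=10, w1_count=100, w2_count=50,
                         total_pairs=10000)
print(a, b, c, d)
print(log_likelihood(a, b, c, d))
```

Under these assumptions d works out to total_pairs - w1_count - w2_count + pair_count, so it stays close to the total number of pairs whenever both words are rare. A wider span would presumably raise the number of pairs counted, and with it d, though how WordSmith Tools counts pairs for a given span is not something this sketch settles.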

I'm very confused and I'd really appreciate it if somebody could help me
:) 

Thank you!
Luciana.


