Corpora: some language modeling questions
F. Peng
f3peng at logos.math.uwaterloo.ca
Mon Jul 30 13:54:58 UTC 2001
I have some questions about language modeling. For the class-based n-gram
models (Brown et al. 1990), the probability of word w_k given its history
w_1_(k-1) is defined as
Pr(w_k|w_1_(k-1)) = Pr(w_k|c_k)Pr(c_k|c_1_(k-1))
where w_1_(k-1) is the history of word w_k: w_1...w_(k-1),
c_k is the class that word w_k belongs to, and
c_1_(k-1) is the class history of word w_k: c_1...c_(k-1).
Under this definition, the sum of Pr(w_k|w_1_(k-1)) over all w_k
is not equal to 1; it is Pr(c_k|c_1_(k-1)), isn't it?
Isn't it necessary for a language model to satisfy the
condition \sum_w Pr(w|history) = 1?
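A small numerical check may help frame the question. The sketch below
assumes a hard clustering (each word belongs to exactly one class) and
uses a class bigram in place of the full class history. The words,
classes, probabilities, and function names are all invented for
illustration and are not taken from Brown et al.

```python
# Hypothetical toy model: every name and number here is made up.

# Hard clustering: each word maps to exactly one class.
WORD_CLASS = {"cat": "NOUN", "dog": "NOUN", "runs": "VERB", "sleeps": "VERB"}

# Pr(w | c): within each class these sum to 1.
P_WORD_GIVEN_CLASS = {"cat": 0.25, "dog": 0.75, "runs": 0.5, "sleeps": 0.5}

# Pr(c_k | c_(k-1)): class bigram, i.e. the class history is
# truncated to the previous class only (an assumption of this sketch).
P_CLASS_GIVEN_PREV = {
    "NOUN": {"NOUN": 0.25, "VERB": 0.75},
    "VERB": {"NOUN": 0.5, "VERB": 0.5},
}


def word_prob(word, prev_class):
    """Pr(w_k | history) = Pr(w_k | c_k) * Pr(c_k | c_(k-1))."""
    c_k = WORD_CLASS[word]
    return P_WORD_GIVEN_CLASS[word] * P_CLASS_GIVEN_PREV[prev_class][c_k]


def sum_over_class(cls, prev_class):
    """Sum Pr(w | history) over the words of one fixed class only."""
    return sum(
        word_prob(w, prev_class) for w in WORD_CLASS if WORD_CLASS[w] == cls
    )


def sum_over_vocabulary(prev_class):
    """Sum Pr(w | history) over every word; c_k changes with w_k."""
    return sum(word_prob(w, prev_class) for w in WORD_CLASS)


if __name__ == "__main__":
    print(sum_over_class("NOUN", "NOUN"))
    print(sum_over_vocabulary("NOUN"))
```

Under these assumptions, summing over the words of one fixed class
gives Pr(c|history) for that class, while summing over the whole
vocabulary, where c_k varies with w_k, gives 1.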
Maybe this is not a question for you, but it has puzzled me for a
while. Thanks in advance for your help.
Best regards
Fuchun
---------------------------------------------------------
Fuchun Peng
Computer Science Department, University of Waterloo
Waterloo, Ontario, Canada, N2L 3G1
1-519-888-4567 ext 3478
f3peng at ai.uwaterloo.ca
--------------------------------------------------------