Corpora: Collaborative effort
Bob Krovetz
krovetz at research.nj.nec.com
Tue Jun 13 01:59:37 UTC 2000
Robert Luk wrote:
>Consider that one has 6 sense tags and the other also has 6 sense tags for the same
>word in a sentence, assuming that they use the same set of sense tags
>(although not likely). The likelihood that the two tagging
>algorithms agreed by chance (independently) is 6 x 1/6 x 1/6. So, the
>above seems to be true if there are 2 sense tags for the word:
>
> 2 x 1/2 x 1/2.
>
>Is this correct?
In the case of Semcor and DSO, the sense inventory was the same (WordNet).
The rate of agreement I mentioned was the agreement we would get by
tagging all instances with the most frequent sense for the word in the corpus.
I don't see why you say it is not likely that they will use the same set of
sense tags. How can we make meaningful comparisons between word-sense
tagging systems without using the same word sense inventory? That was
the purpose of the SENSEVAL competition.
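
To make the two figures in this thread concrete, here is a rough sketch in Python. It is an illustration under stated assumptions, not the code used for Semcor, DSO, or SENSEVAL. The first function computes the probability that two independent taggers agree by chance. With k senses chosen uniformly it reduces to k x 1/k x 1/k = 1/k, which matches the quoted arithmetic; the uniform distribution is an assumption, since real sense distributions are usually skewed. The second function computes the share of instances covered by the most frequent sense of a word. The function names and the sample tags are hypothetical.

```python
from collections import Counter
from fractions import Fraction


def chance_agreement(p, q):
    """Probability that two independent taggers choose the same sense.

    p and q map each sense tag to the probability that the respective
    tagger assigns it. Independence is an assumption of this sketch.
    """
    senses = set(p) | set(q)
    return sum(p.get(s, 0) * q.get(s, 0) for s in senses)


def uniform(k):
    """Uniform distribution over k senses (hypothetical tags 0..k-1)."""
    return {s: Fraction(1, k) for s in range(k)}


def most_frequent_sense_baseline(tags):
    """Fraction of instances labeled with the most frequent sense.

    tags is the list of sense tags assigned to one word in a corpus.
    """
    counts = Counter(tags)
    _, top_count = counts.most_common(1)[0]
    return Fraction(top_count, len(tags))


if __name__ == "__main__":
    print(chance_agreement(uniform(6), uniform(6)))  # 6 senses
    print(chance_agreement(uniform(2), uniform(2)))  # 2 senses
    # Hypothetical tags for one word, for illustration only.
    print(most_frequent_sense_baseline(["s1", "s1", "s1", "s2", "s3"]))
```

When the distributions are skewed toward one sense, chance_agreement gives a value above 1/k, which is one reason the most-frequent-sense figure is the more informative point of comparison.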
Bob