Corpora: Collaborative effort

Bob Krovetz krovetz at research.nj.nec.com
Tue Jun 13 01:59:37 UTC 2000


Robert Luk wrote:

>Consider that one has 6 sense tags and the other also has 6 sense tags for the same
>word in a sentence, assuming that they use the same set of sense tags
>(although not likely). The likelihood that the two tagging
>algorithms agreed by chance (independently) is 6 x 1/6 x 1/6. So, the
>above seems to be true if there are 2 sense tags for the word:
>
>	2 x 1/2 x 1/2.
>
>Is this correct?
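
The arithmetic in the quoted passage can be checked with a small sketch. It assumes, as the quote does, that the two taggers choose independently and uniformly among the same k senses; the function names are made up for illustration, and real sense distributions are rarely uniform.

```python
from fractions import Fraction


def chance_agreement(p_a, p_b):
    """Probability that two independent taggers pick the same sense.

    p_a and p_b map each sense label to the probability that the
    respective tagger assigns it (assumption: a shared sense inventory).
    """
    return sum(p_a[s] * p_b.get(s, Fraction(0)) for s in p_a)


def chance_agreement_uniform(k):
    """Special case from the quote: both taggers uniform over k senses."""
    if k < 1:
        raise ValueError("k must be at least 1")
    uniform = {sense: Fraction(1, k) for sense in range(k)}
    return chance_agreement(uniform, uniform)


# 6 x 1/6 x 1/6 and 2 x 1/2 x 1/2, kept as exact fractions
six_senses = chance_agreement_uniform(6)
two_senses = chance_agreement_uniform(2)
```

Under these assumptions the sum collapses to 1/k, so the two cases in the quote come to 1/6 and 1/2.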

In the case of Semcor and DSO, the sense inventory was the same (WordNet).
The rate of agreement I mentioned was the agreement we would get by
tagging all instances with the most frequent sense for the word in the corpus.
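
As a rough sketch of that baseline: for each word, find its most frequent sense in the corpus and count how many instances carry that sense. The function name, the (word, sense) input format, and the toy sense labels below are assumptions for illustration only, not the format of Semcor or DSO.

```python
from collections import Counter, defaultdict


def most_frequent_sense_baseline(instances):
    """Fraction of instances matched by always guessing the word's
    most frequent sense.

    instances: iterable of (word, sense) pairs (hypothetical format).
    """
    counts = defaultdict(Counter)
    for word, sense in instances:
        counts[word][sense] += 1

    total = sum(sum(c.values()) for c in counts.values())
    if total == 0:
        raise ValueError("no tagged instances given")

    # For each word, the count of its most frequent sense is the number
    # of instances the baseline gets right.
    hits = sum(c.most_common(1)[0][1] for c in counts.values())
    return hits / total


# Toy data with invented sense labels
toy = [
    ("bank", "bank_1"),
    ("bank", "bank_1"),
    ("bank", "bank_2"),
    ("line", "line_1"),
]
baseline = most_frequent_sense_baseline(toy)
```

On the toy data the baseline matches 3 of the 4 instances.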

I don't see why you say it is not likely that they will use the same set of
sense tags.  How can we make meaningful comparisons between word-sense
tagging systems without using the same word sense inventory?  That was
the purpose of the SENSEVAL competition.

Bob


