[Corpora-List] WSD / # WordNet senses / Mechanical Turk

John F Sowa sowa at bestweb.net
Tue Jul 16 00:59:48 UTC 2013


On 7/15/2013 6:15 PM, Kilian Evang wrote:
> Off the top of my head, here's two relevant studies on inter-rater
> reliability for WSD, one for the case of expert annotators and one for
> the case of non-experts:
>
> http://link.springer.com/article/10.1023/A:1002693207386#page-1

From the abstract at the pointy end of this pointer:
> The exercise identifies the state-of-the-art for fine-grained word sense
> disambiguation, where training data is available, as 74–78% correct, with
> a number of algorithms approaching this level of performance. For systems
> that did not assume the availability of training data, performance was
> markedly lower and also more variable. Human inter-tagger agreement was
> high, with the gold standard taggings being around 95% replicable.

Implication:  At 75% accuracy, a state-of-the-art program would make
about 75 errors on a 300-word page of text.  That averages out to two
errors per 8-word sentence, or five errors per 20-word sentence.

For the "gold" standard, there would still be 15 errors in a 300-word
page.  Miss Elliott, my high-school English teacher, wouldn't give
anyone a gold star for 15 errors per page.
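For anyone who wants to check the arithmetic, here is a minimal sketch
of the calculation above (the 75% figure is the midpoint of the quoted
74-78% range; the function name is just illustrative):

```python
def expected_errors(accuracy: float, n_words: int) -> float:
    """Expected number of mis-tagged words out of n_words,
    assuming the per-word error rate is (1 - accuracy)."""
    return (1.0 - accuracy) * n_words

# State-of-the-art system at ~75% correct:
print(expected_errors(0.75, 300))  # errors per 300-word page -> 75.0
print(expected_errors(0.75, 8))    # per 8-word sentence      -> 2.0
print(expected_errors(0.75, 20))   # per 20-word sentence     -> 5.0

# Gold standard at ~95% replicability:
print(expected_errors(0.95, 300))  # ~15 errors per 300-word page
```

Of course this assumes every word on the page needs disambiguating;
the point stands either way.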

John

_______________________________________________
UNSUBSCRIBE from this list: http://mailman.uib.no/options/corpora
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora
