[Corpora-List] WSD / # WordNet senses / Mechanical Turk
John F Sowa
sowa at bestweb.net
Tue Jul 16 00:59:48 UTC 2013
On 7/15/2013 6:15 PM, Kilian Evang wrote:
> Off the top of my head, here are two relevant studies on inter-rater
> reliability for WSD, one for the case of expert annotators and one for
> the case of non-experts:
>
> http://link.springer.com/article/10.1023/A:1002693207386#page-1
From the abstract at the pointy end of this pointer:
> The exercise identifies the state-of-the-art for fine-grained word sense
> disambiguation, where training data is available, as 74–78% correct, with
> a number of algorithms approaching this level of performance. For systems
> that did not assume the availability of training data, performance was
> markedly lower and also more variable. Human inter-tagger agreement was
> high, with the gold standard taggings being around 95% replicable.
Implication: at 75% accuracy, a state-of-the-art program would make
about 75 word-sense errors on a 300-word page of text. That averages
two errors per 8-word sentence, or five errors per 20-word sentence.
For the "gold" standard, there would still be 15 errors in a 300-word
page. Miss Elliott, my high-school English teacher, wouldn't give
anyone a gold star for 15 errors per page.
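The arithmetic above can be checked with a few lines of Python. This is
a back-of-envelope sketch that assumes a uniform error rate over every
word on the page; in practice WSD accuracy is measured only on
ambiguous content words, so the real counts would differ somewhat:

```python
def expected_errors(accuracy, n_words):
    """Expected number of wrongly tagged words at a given accuracy."""
    return (1 - accuracy) * n_words

# State-of-the-art system: 74-78% correct; take 75% as the midpoint.
print(expected_errors(0.75, 300))  # about 75 errors on a 300-word page
print(expected_errors(0.75, 8))    # about 2 errors in an 8-word sentence
print(expected_errors(0.75, 20))   # about 5 errors in a 20-word sentence

# "Gold standard" inter-tagger agreement: ~95% replicable.
print(expected_errors(0.95, 300))  # about 15 errors per page
```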
John
_______________________________________________
UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora