[Corpora-List] Intercoder agreement
Ken Litkowski
ken at clres.com
Thu May 10 15:50:04 UTC 2007
In the recently completed SemEval, several tasks involved some aspect of
granularity. Specifically, an effort was made to group fine-grained senses
into "coarse" senses.
Clearly, two people (or automatic methods) may group senses
differently. One question is how to measure agreement between any two
groupings.
Stated more formally, for a word with n senses, each coder creates 1 to
n buckets. How do we measure agreement in their classifications? One
real example involves 5 senses, with coder A making the grouping {{1, 2,
3}, {4}, {5}} and coder B making the grouping {{1, 2, 3, 4}, {5}}. One
way of measuring might be to identify the number of transformations
necessary to make the groups identical (in this case, only one is
necessary). This is slightly unsatisfying.
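To make that concrete, here is a minimal sketch (my own illustration, not
part of the proposal above) that reads the "number of transformations" as
the partition distance: the minimum number of senses that must be
reassigned so the two groupings coincide, computable by maximally matching
the blocks of the two partitions. It assumes Python with numpy and scipy;
the names are illustrative only.

import numpy as np
from scipy.optimize import linear_sum_assignment

def partition_distance(p, q):
    """Minimum number of elements to reassign so that partition p equals partition q."""
    n = sum(len(block) for block in p)
    # Overlap matrix: overlaps[i][j] = |p_i intersect q_j|
    overlaps = np.array([[len(set(a) & set(b)) for b in q] for a in p])
    # Pad to a square matrix so every block can be matched (possibly with zero overlap).
    size = max(len(p), len(q))
    padded = np.zeros((size, size), dtype=int)
    padded[:len(p), :len(q)] = overlaps
    # Maximize total overlap = minimize negated overlap.
    rows, cols = linear_sum_assignment(-padded)
    return n - padded[rows, cols].sum()

coder_a = [{1, 2, 3}, {4}, {5}]
coder_b = [{1, 2, 3, 4}, {5}]
print(partition_distance(coder_a, coder_b))   # -> 1, the single transformation noted above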
Another way of examining the groupings is through intertagger agreement
(ITA) on the end groups, e.g., the 90% solution of OntoNotes. In this
case, if ITA is less than 90%, senses are regrouped until it is greater
than 90%. This method may mask the semantic coherence of a set of senses.
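One plausible way to compute such a group-level ITA (an assumption on my
part, not a description of the OntoNotes procedure) is observed agreement
on the coarse groups that two annotators' fine-grained tags map into. The
tags and grouping below are invented purely for illustration.

def group_ita(tags_a, tags_b, sense_to_group):
    """Observed agreement on coarse groups for two parallel sequences of sense tags."""
    matches = sum(sense_to_group[a] == sense_to_group[b]
                  for a, b in zip(tags_a, tags_b))
    return matches / len(tags_a)

# Grouping {{1, 2, 3}, {4}, {5}} expressed as a sense -> group mapping.
grouping = {1: 0, 2: 0, 3: 0, 4: 1, 5: 2}
tags_a = [1, 2, 4, 5, 3, 1]
tags_b = [2, 2, 4, 5, 4, 1]
print(group_ita(tags_a, tags_b, grouping))   # -> 0.83; below 0.90, so the
                                             #    senses would be regrouped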
Any suggestions are welcome.
Ken
--
Ken Litkowski                   TEL.: 301-482-0237
CL Research                     EMAIL: ken at clres.com
9208 Gue Road
Damascus, MD 20872-1025 USA     Home Page: http://www.clres.com