[Corpora-List] Intercoder agreement

Yannick Versley versley at sfs.uni-tuebingen.de
Thu May 10 16:30:52 UTC 2007


Hi,

Rebecca Passonneau proposed (in a 1997 paper) a kappa statistic based on the 
model-theoretic scoring scheme of Vilain et al., which I think would be 
useful for this kind of thing. In this case, you have 2 'links' shared 
(1-2, 2-3), one made only by coder B (3-4), and one that neither coder makes (4-5),
with marginals of 2/4 links for A and 3/4 links for B.
Expected agreement would be 0.5 (2/4*3/4+2/4*1/4), observed agreement would be 
0.75 (2/4+1/4), which gives kappa=(0.75-0.5)/(1-0.5)=0.5.
Vilain et al.'s counting scheme gives you these 'link' counts even for more 
complicated groupings.
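
To make the computation concrete, here is a small sketch in Python (function
names are my own); it assumes the agreed-link count is obtained with Vilain et
al.'s intersection counting, i.e. each of coder A's clusters is split by coder
B's grouping and every resulting piece of size k preserves k-1 links:

def link_count(partition):
    # Vilain-style count: a cluster of size k contributes k-1 links
    return sum(len(c) - 1 for c in partition)

def shared_link_count(part_a, part_b):
    # Split each of A's clusters by B's grouping; each piece of size k
    # preserves k-1 links on which both coders agree.
    cluster_of_b = {item: i for i, c in enumerate(part_b) for item in c}
    shared = 0
    for cluster in part_a:
        pieces = {}
        for item in cluster:
            pieces.setdefault(cluster_of_b[item], []).append(item)
        shared += sum(len(p) - 1 for p in pieces.values())
    return shared

def link_kappa(part_a, part_b):
    n = sum(len(c) for c in part_a)   # number of senses
    total = n - 1                     # maximum possible links
    a, b = link_count(part_a), link_count(part_b)
    both = shared_link_count(part_a, part_b)
    neither = total - a - b + both
    observed = (both + neither) / float(total)
    expected = (a * b + (total - a) * (total - b)) / float(total * total)
    return (observed - expected) / (1 - expected)

coder_a = [{1, 2, 3}, {4}, {5}]
coder_b = [{1, 2, 3, 4}, {5}]
print(link_kappa(coder_a, coder_b))   # -> 0.5, as in the hand calculation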

Best,
Yannick
> In the recently completed SemEval, several tasks involved some aspect of
> granularity.  Specifically, an effort was made to group senses into "coarse" senses.
>   Clearly, two people (or automatic methods) may group senses
> differently.  One question is how to measure agreement between any two
> groupings.
>
> Stated more formally, for a word with n senses, each coder creates 1 to
> n buckets.  How do we measure agreement in their classifications?  One
> real example involves 5 senses, with coder A making the grouping {{1, 2,
> 3}, {4}, {5}} and coder B making the grouping {{1, 2, 3, 4}, {5}}.  One
> way of measuring might be to identify the number of transformations
> necessary to make the groups identical (in this case, only one is
> necessary).  This is slightly unsatisfying.

-- 
Yannick Versley
Seminar für Sprachwissenschaft, Abt. Computerlinguistik
Wilhelmstr. 19, 72074 Tübingen
Tel.: (07071) 29 77352


