[Corpora-List] Dice coefficient

Bengt Dahlqvist bengt.dahlqvist at ling.uu.se
Mon Apr 24 13:17:24 UTC 2006


At 09:50 2006-04-19, Markus Saers wrote:
>The only definition of the Dice coefficient that I have seen looks like this:
>
>Dice = 2 * p(ws, wt) / ( p(ws) + p(wt) )

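The Dice index can also be computed from a 2x2 contingency table of counts:

        x=1   x=0
y=1      a     b
y=0      c     d

Here a counts the cases where both x and y occur, b and c the cases where
only one of them occurs, and d the cases where neither occurs.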

Here the Dice (1945) index is defined as
Dice = 2*a / (2*a + b + c)
This form can be easier to compute in certain cases, for instance when
working directly with counts rather than probabilities.
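
As a rough check that the two definitions agree, here is a minimal Python
sketch. It assumes that x stands for ws and y for wt, so that p(ws, wt) = a/N,
p(ws) = (a+c)/N and p(wt) = (a+b)/N with N = a+b+c+d. That mapping and the
function names are my own assumptions, not part of the original question.

```python
from fractions import Fraction


def dice_from_probabilities(a, b, c, d):
    """Dice = 2 * p(ws, wt) / (p(ws) + p(wt)), with probabilities
    estimated from the table counts (assumed mapping: x = ws, y = wt)."""
    n = a + b + c + d
    p_joint = Fraction(a, n)    # both x and y occur
    p_x = Fraction(a + c, n)    # x=1 column total
    p_y = Fraction(a + b, n)    # y=1 row total
    return 2 * p_joint / (p_x + p_y)


def dice_from_counts(a, b, c):
    """Dice = 2*a / (2*a + b + c), directly from the table counts."""
    return Fraction(2 * a, 2 * a + b + c)


# Made-up example counts, for illustration only.
print(dice_from_probabilities(10, 5, 3, 82))  # prints 5/7
print(dice_from_counts(10, 5, 3))             # prints 5/7
```

N cancels out, which is why d is not needed in the count form. Both functions
raise ZeroDivisionError when a, b and c are all zero.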

In the same manner, the Jaccard (1908) index is defined as
Jaccard = a/(a+b+c)

and, e.g., the Ochiai (1957) index as
Ochiai = a / (sqrt(a+b) * sqrt(a+c))
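
The three indices above could be computed together along these lines. This is
only a sketch: the function name and the example counts are made up for
illustration, and no guard is included for tables where a, a+b or a+c is zero.

```python
import math


def association_indices(a, b, c):
    """Return the Dice, Jaccard and Ochiai indices for the counts
    a, b, c of a 2x2 contingency table (d is not used by any of them)."""
    return {
        "dice": 2 * a / (2 * a + b + c),
        "jaccard": a / (a + b + c),
        "ochiai": a / (math.sqrt(a + b) * math.sqrt(a + c)),
    }


# Made-up example counts, for illustration only.
for name, value in association_indices(10, 5, 3).items():
    print(f"{name}: {value:.3f}")
# prints:
# dice: 0.714
# jaccard: 0.556
# ochiai: 0.716
```

Note that Dice and Jaccard always give the same ranking, since
Dice = 2*Jaccard / (1 + Jaccard).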

The literature gives a wealth of other indices as well.
--
/Bengt Dahlqvist


