[Corpora-List] Multiple category assignment

Aliabbas Petiwala aliabbasjp at gmail.com
Sun Aug 25 14:55:04 UTC 2013


For the task of building a human-annotated corpus:

There are annotation tasks where the items belong to multiple categories
and annotators have to mark each category to which the item belongs.

E.g., the same coder c1 assigns two categories (v1, v2) to item '1':

task = AnnotationTask(data=[('c1', '1', 'v1'), ('c1', '1', 'v2'), ...])
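To make the data shape concrete, here is a minimal sketch that groups such (coder, item, label) triples into one label set per coder and item. It assumes plain tuples and no particular library; the helper name group_labels and the second coder c2 are made up for illustration.

```python
from collections import defaultdict

def group_labels(triples):
    """Collect every label a coder gave an item into one frozenset."""
    grouped = defaultdict(set)
    for coder, item, label in triples:
        grouped[(coder, item)].add(label)
    # Freeze the sets so they are hashable and can serve as single labels.
    return {key: frozenset(labels) for key, labels in grouped.items()}

# Hypothetical data: c1 marks two categories on item '1', c2 marks one.
triples = [("c1", "1", "v1"), ("c1", "1", "v2"), ("c2", "1", "v1")]
label_sets = group_labels(triples)
```

Each (coder, item) pair then carries a single set-valued label, which is one possible alternative to keeping several triples for the same coder and item.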

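So should such multiple categories be represented as bitstrings, so that
n categories yield a whopping 2^n possible combined labels? If each
bitstring is treated as one atomic label, two annotators who differ in a
single category count as disagreeing completely, which would surely make
the inter-annotator agreement (IAA) scores very low for minor differences.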
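A small sketch of the concern, under stated assumptions: three hypothetical categories, two hypothetical coders, and raw (not chance-corrected) agreement on a single item. The Jaccard overlap is brought in here only as a contrast to exact matching; it is not proposed as the established answer, and all function names are illustrative.

```python
def to_bitstring(labels, categories):
    """Encode a set of labels as a bitstring over a fixed category order."""
    return "".join("1" if c in labels else "0" for c in categories)

def exact_match(a, b):
    """Agreement when each bitstring is one atomic label: all or nothing."""
    return 1.0 if a == b else 0.0

def jaccard(a, b):
    """Set overlap: shared labels divided by all labels used by either coder."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

categories = ["v1", "v2", "v3"]      # hypothetical category inventory
c1 = frozenset({"v1", "v2"})         # coder c1 on item '1'
c2 = frozenset({"v1"})               # hypothetical coder c2 on item '1'

n_possible = 2 ** len(categories)    # number of distinct bitstrings
bits_c1 = to_bitstring(c1, categories)
bits_c2 = to_bitstring(c2, categories)

exact = exact_match(bits_c1, bits_c2)  # coders differ in one category only
overlap = jaccard(c1, c2)              # partial credit for the shared label
```

With exact matching the two coders score zero on this item even though they share v1, while the set-overlap score gives them partial credit. Whether a distance of this kind is appropriate inside a chance-corrected coefficient is exactly the open question.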

So what is the best way to compute annotation agreement for tasks that
require multiple category assignments per item? And how should the
categories be represented in such cases?