[Corpora-List] Linguistic Annotation Workshop/SIGANN : Call for Participation

Nancy Ide ide at cs.vassar.edu
Fri Jan 25 15:59:02 UTC 2008


CALL FOR PARTICIPATION

ACL Special Interest Group on Annotation (SIGANN)

Sharable Corpus and Best Practice Guidelines Working Group Sessions

1-6PM, May 27, 2008


The Second Linguistic Annotation Workshop

Held in conjunction with LREC 2008

Marrakech, Morocco


http://verbs.colorado.edu/LAW2008/


The SIGANN Sharable Corpus and Best Practice Guidelines Working Groups  
will hold a joint session at the Second Linguistic Annotation Workshop  
on the afternoon of May 27, 2008, in Marrakech, Morocco. The session  
will be devoted to issues surrounding the merging and harmonization of  
linguistic annotations representing various phenomena that may have  
been produced by different groups using different formats, and may be  
based on different theoretical approaches. The discussions will use as  
a point of departure linguistic annotations of a portion of the SIGANN  
Sharable Corpus contributed by members of the computational  
linguistics community.


We solicit contributions of manually or automatically produced  
annotations of the SIGANN Sharable Corpus for any linguistic  
phenomenon, including but not limited to morpho-syntax, syntax,  
semantic roles, word senses, named entities, temporal elements,  
events, co-reference and other discourse-level phenomena. The  
annotations will be collected in early April, after which the session  
organizers will coordinate an effort to merge and compare the  
contributed annotations. Based on the experience of this exercise,  
discussion points, including examples, will be drawn up for  
consideration in the joint session. Issues to be considered will  
include:


(1)  What are the issues/problems of merging diverse annotations of  
different phenomena into a single multi-layer annotation, in terms of  
harmonizing different physical formats?

(2)  What are the issues/problems of merging diverse annotations of  
different phenomena into a single multi-layer annotation, in terms of  
enabling a coherent and comprehensive linguistic description?

(3)  Are there phenomena for which an attempt at  
compatibility/harmonization is not desirable?

(4)  What are the implications and/or suggestions of this exercise for  
the development of best practice guidelines for linguistic annotation?

(5)  Are there certain phenomena (e.g. segmentation into tokens,  
phrases, etc.) that lend themselves more readily to the specification  
of standard practices, and for which the existence of a common method  
would enhance annotation interoperability?

(6)  What are the good and bad consequences of introducing a  
theoretical bias into the merging process? A theoretically biased  
merging procedure creates essentially a new annotation that uses  
previously created annotation as input in a destructive manner so that  
the input annotation cannot be read directly from the merged output.  
Can the creation of a merged annotation that is consistent with a  
theory justify making these changes?  Can “errors” in input annotation  
be detected in this way?


Those who wish to contribute annotations and/or be involved in  
discussions at the session should consult the LAW II website for  
details: http://verbs.colorado.edu/LAW2008/, or contact the session  
organizers.


Session organizers:

Best Practices Working Group

Nancy Ide, Vassar College (ide [at] cs.vassar.edu)

Sharable Corpus Working Group

Adam Meyers, New York University (meyers [at] cs.nyu.edu)

  