[Corpora-List] Linguistic Annotation Workshop/SIGANN : Call for Participation
Nancy Ide
ide at cs.vassar.edu
Fri Jan 25 15:59:02 UTC 2008
CALL FOR PARTICIPATION
ACL Special Interest Group on Annotation (SIGANN)
Sharable Corpus and Best Practice Guidelines Working Group Sessions
1-6PM, May 27, 2008
The Second Linguistic Annotation Workshop
Held in conjunction with LREC 2008
Marrakech, Morocco
http://verbs.colorado.edu/LAW2008/
The SIGANN Sharable Corpus and Best Practice Guidelines Working Groups
will hold a joint session at the Second Linguistic Annotation Workshop
on the afternoon of May 27, 2008, in Marrakech, Morocco. The session
will be devoted to issues surrounding the merging and harmonization of
linguistic annotations representing various phenomena that may have
been produced by different groups using different formats, and may be
based on different theoretical approaches. The discussions will use as
a point of departure linguistic annotations of a portion of the SIGANN
Sharable Corpus contributed by members of the computational
linguistics community.
We solicit contributions of manually or automatically produced
annotations of the SIGANN Sharable Corpus for any linguistic
phenomenon, including but not limited to morpho-syntax, syntax,
semantic roles, word senses, named entities, temporal elements,
events, co-reference and other discourse-level phenomena. The
annotations will be collected in early April, after which the session
organizers will coordinate an effort to merge and compare the
contributed annotations. Based on the experience of this exercise,
discussion points including examples will be drawn up for
consideration in the joint session. Issues to be considered will
include:
(1) What are the issues/problems of merging diverse annotations of
different phenomena into a single multi-layer annotation, in terms of
harmonizing different physical formats?
(2) What are the issues/problems of merging diverse annotations of
different phenomena into a single multi-layer annotation, in terms of
enabling a coherent and comprehensive linguistic description?
(3) Are there phenomena for which an attempt at compatibility/
harmonization is not desirable?
(4) What are the implications and/or suggestions of this exercise for
the development of best practice guidelines for linguistic annotation?
(5) Are there certain phenomena (e.g. segmentation into tokens,
phrases, etc.) that lend themselves more readily to the specification
of standard practices, and for which the existence of a common method
would enhance annotation interoperability?
(6) What are the good and bad consequences of introducing a
theoretical bias into the merging process? A theoretically biased
merging procedure essentially creates a new annotation that uses
previously created annotation as input in a destructive manner, so that
the input annotation cannot be read directly from the merged output.
Can the creation of a merged annotation that is consistent with a
theory justify making these changes? Can “errors” in input annotation
be detected in this way?
Those who wish to contribute annotations and/or be involved in
discussions at the session should consult the LAW II website for
details: http://verbs.colorado.edu/LAW2008/, or contact the session
organizers.
Session organizers:
Best Practices Working Group
Nancy Ide, Vassar College (ide [at] cs.vassar.edu)
Sharable Corpus Working Group
Adam Meyers, New York University (meyers [at] cs.nyu.edu)