[Corpora-List] CoNLL 2012 Shared Task -- Call for Participation

german rigau german.rigau at ehu.es
Thu Dec 29 19:37:39 UTC 2011


Hi Egoitz,

How is everything going? Well, I hope ... we are already in Huelva, after
stuffing ourselves quite a bit in Barcelona ... ;-P

How did the experiments go? Do we have any results?

And I have taken another look at the CoNLL shared task ... because, deep
down, coreference could perhaps be modeled with something similar to the
model you currently have, no? A list of possible candidates, etc.

See you soon,

German

On Thu, Dec 29, 2011 at 12:39 AM, Erik Tjong Kim Sang <erikt at xs4all.nl> wrote:

> -----------------------------------------------------------------------
> CoNLL 2012 Shared Task -- Call for Participation
> ================================================
>
> Modeling Multilingual Unrestricted Coreference in OntoNotes
> -----------------------------------------------------------
>
> CoNLL-2012, to be held jointly with EMNLP in conjunction with ACL (Jeju,
> Korea, 12-14 July 2012), will continue the tradition of including a shared
> task for natural language learning systems.  The 2012 shared task will
> target the modeling of coreference resolution for multiple languages.  The
> importance of the latter for the entity/event detection task, namely
> identifying all mentions of entities and events in text and clustering
> them into equivalence classes, has been well recognized in the natural
> language processing community. Automatic identification of coreferring
> entities and events in text has been an uphill battle for several decades,
> partly because it can require world knowledge which is not well-defined
> and partly owing to the lack of substantial annotated data.
>
> The OntoNotes project (http://www.bbn.com/ontonotes/) -- a collaborative
> effort between BBN Technologies, University of Colorado, University of
> Southern California (ISI), University of Pennsylvania and Brandeis
> University -- created a large-scale, accurate multilingual corpus for
> general anaphoric coreference that covers entities and events not limited
> to noun phrases or a limited set of entity types. The Linguistic Data
> Consortium (LDC) has agreed to make it freely available to the research
> community. The coreference layer in OntoNotes constitutes one part of a
> multi-layer, integrated annotation of shallow semantic structure in text
> with high inter-annotator agreement. In addition to coreference, this data
> is also tagged with syntactic trees, high-coverage verb and some noun
> propositions, partial verb and noun word senses, and a rich set of named
> entity types.
>
> Modeling multilingual unrestricted coreference in the OntoNotes data is
> the shared task for CoNLL-2012. This is an extension of the CoNLL-2011
> shared task and involves automatic anaphoric mention detection and
> coreference resolution across three languages -- English, Chinese and
> Arabic -- using the OntoNotes v5.0 corpus, given predicted information on
> the syntax, proposition, word sense and named entity layers. The training
> data will contain both gold standard and predicted annotations, but only
> predicted annotations will be provided with the test material. The English
> and Chinese portions each comprise roughly one million words from
> newswire, magazine articles, broadcast news, broadcast conversations, web
> data and conversational speech. The English corpus also contains a further
> 200k words of the English translation of the New Testament. The Arabic
> portion is smaller, comprising 300k words of newswire articles.
>
> The evaluation will follow CoNLL-2011's strategy.  The score for each
> language will be determined by computing the unweighted average across the
> MUC, BCUBED, and CEAF metrics.  The introduction of two new languages in
> the shared task offers a unique opportunity to carry out research in new
> contexts of coreference resolution and derive more general findings, which
> go beyond the monolingual (English) setting.  Given the multilingual focus
> of this shared task, the winner will be determined by aggregating the
> scores across all languages.  Although participants are not required to
> work with all three languages, they are strongly encouraged to work with
> at least two languages, one of which may be English.  Systems will be
> penalized with a null score for the languages that are left out.
> In addition, the review process of the shared task will favorably consider
> papers reporting experiments in multilingual settings.
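
As a rough illustration of the scoring rule quoted above, here is a minimal
sketch in Python. It is not the official scorer. The call only says that
scores are "aggregated" across languages, so taking a plain mean over the
three languages is an assumption, as are the function names, the dictionary
layout and the numbers, which are made up.

```python
from statistics import mean

LANGUAGES = ("english", "chinese", "arabic")
METRICS = ("muc", "bcubed", "ceaf")


def language_score(metric_scores):
    """Unweighted average of the MUC, BCUBED and CEAF scores for one language."""
    return mean(metric_scores[m] for m in METRICS)


def overall_score(results):
    """Combine per-language scores; a language that is left out scores 0.

    Assumption: the aggregation is a plain mean over the three languages.
    The call does not spell out the exact aggregation.
    """
    per_language = {
        lang: language_score(results[lang]) if lang in results else 0.0
        for lang in LANGUAGES
    }
    return mean(list(per_language.values())), per_language


# Made-up numbers for a system that skipped Arabic (illustration only).
example = {
    "english": {"muc": 60.0, "bcubed": 70.0, "ceaf": 50.0},
    "chinese": {"muc": 55.0, "bcubed": 65.0, "ceaf": 45.0},
}
total, per_language = overall_score(example)
```

With these numbers the skipped language contributes a zero, so it lowers
the combined score instead of being ignored.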
>
> Interested participants should register for the task
> (http://conll.cemantix.org/2012/registration.html) as soon as possible, in
> order to start any paperwork needed to obtain the corpora and to be kept
> updated on the details of the task going forward.  More information about
> the task will soon be available on the task website:
> http://conll.cemantix.org/2012
>
>
>
> Organizers
> ----------
>
> Sameer Pradhan (Chair), Raytheon BBN Technologies, Cambridge, MA
> Alessandro Moschitti, University of Trento, Italy
> Nianwen Xue, Brandeis University, Waltham, MA
>
>
>
> Advisory Committee
> ------------------
>
> Mitchell Marcus, University of Pennsylvania, Philadelphia, PA
> Martha Palmer, University of Colorado, Boulder, CO
> Lance Ramshaw, Raytheon BBN Technologies, Cambridge, MA
> Ralph Weischedel, Raytheon BBN Technologies, Cambridge, MA
>
>
>
> Contact
> -------
>
> Questions about the CoNLL-2012 shared task can be sent to
> conll-2012-st at cemantix.org
>
>
>
> Important Dates
> ---------------
>
>  December 30, 2011: Registration begins
>   January 22, 2012: Trial Data available
>  February 10, 2012: Registration deadline
>  February 19, 2012: Training and development set available
>     April 22, 2012: Test set available
>     April 26, 2012: System outputs collected
>     April 29, 2012: System results due to participants
>       May  6, 2012: System papers due
>       May 15, 2012: Reviews back to authors
>       May 22, 2012: Camera ready papers due
>
> _______________________________________________
> UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora
>
>