[Corpora-List] Second Announcement of Data Release and Call for Participation

Thu May 7 13:45:31 UTC 2009

Second Announcement of Data Release and Call for Participation

Third i2b2 Shared-Task and Workshop
Challenges in Natural Language Processing for Clinical Data
Medication Extraction Challenge

Data Release: 1 June, 2009
Evaluation: August, 2009
Paper Submission: 1 September, 2009
Workshop: November, 2009 in San Francisco, CA
URL: i2b2.org/NLP

********* Register NOW at i2b2.org/NLP **********

Organizer: Informatics for Integrating Biology and the Bedside, i2b2, a
National Center for Biomedical Computing

The medication extraction challenge aims to encourage development of natural
language processing systems for the extraction of medication-related
information from narrative patient records. Information to be targeted
includes medications, dosages, modes of administration, frequency of
administration, and the reason for administration. To encourage
the development of semi-supervised and unsupervised systems for medication
extraction, the development data for the medication extraction challenge
will be distributed unannotated. Participants will be allowed to create
their own annotations. For this purpose, annotation guidelines and sample
annotated records will be provided.

Registration for the challenge opened on April 1, 2009. Development data for
the challenge will be released in June. Evaluation will be on test data,
which is to be released in August. The results of the challenge will be
presented at the workshop organized by i2b2.

Data for the medication extraction challenge will be released under a Data
Use Agreement. Obtaining the data requires completing a registration and
signing the Data Use Agreement. Downloading the data implies commitment on
the part of the downloading team to participate in the medication
extraction challenge. Data can be kept and used for research purposes
beyond the duration of the challenge.

Evaluation Dates, File Formats, and Evaluation Metrics.

The medication extraction challenge is inspired by the Question Answering
track of the Text REtrieval Conference (TREC) of NIST. Following
the standards of NIST, evaluation will be on the test data and evaluation
metrics will resemble those of NIST. Participating teams are asked to stop
development as soon as they download the test data. Each team is allowed
to upload (through the i2b2 website) up to three system runs. System output is
expected in the form of standoff annotations, following the exact format
of the ground truth annotations to be provided by i2b2.
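
As a rough illustration of what "standoff" means here, the sketch below keeps
the record text untouched and stores each annotated field as a pair of
character offsets into it. This is only a sketch under stated assumptions: the
sample sentence is invented, and the field names and the offset convention are
hypothetical. The exact format is the one defined by the ground truth
annotations and guidelines provided by i2b2, not this one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """Character offsets into the untouched record text (end is exclusive)."""
    start: int
    end: int

    def text(self, record: str) -> str:
        # Recover the annotated string from the original record.
        return record[self.start:self.end]


def find_span(record: str, phrase: str) -> Span:
    """Locate the first occurrence of phrase and return its offsets."""
    start = record.index(phrase)  # raises ValueError if the phrase is absent
    return Span(start, start + len(phrase))


# Invented example sentence, not taken from the challenge data.
record = "Patient was started on lisinopril 10 mg by mouth daily for hypertension."

# Hypothetical field names covering the five kinds of targeted information.
entry = {
    "medication": find_span(record, "lisinopril"),
    "dosage": find_span(record, "10 mg"),
    "mode": find_span(record, "by mouth"),
    "frequency": find_span(record, "daily"),
    "reason": find_span(record, "hypertension"),
}

for field, span in entry.items():
    print(f"{field}: [{span.start}, {span.end}) -> {span.text(record)!r}")
```

Because the annotations only point into the record, several annotators or
system runs can be compared against the same unmodified text.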

Test data will be annotated by the challenge participants. After uploading
their system outputs to the i2b2 website, each team will be asked to
annotate 10 records/person. Multiple annotations for each record will be
obtained before finalizing the ground truth. Downloading the development
data constitutes commitment on the part of the challenge participants to
annotate 10 records/person from the test data.

Participants are asked to submit a short paper describing their system and
analyzing their performance. Papers should be in AMIA style and should not
exceed five pages. Authors of top-performing systems and of particularly
novel approaches will be invited to present or demo their systems at the
workshop. A journal special issue will be organized for a subset of the
top ten systems.

Tentative Schedule
April 1, 2009    Registration Open
June, 2009    Development Data Release
August, 2009    Test Data Release at 9am EST
October, 2009    Notification of Results to Each Participant
November, 2009    Workshop

Organizing Committee:

Ozlem Uzuner, Chair, SUNY at Albany
Middle East Technical University Northern Cyprus Campus
Imre Solti, University of Washington
Peter Szolovits, MIT CSAIL
Isaac Kohane, Partners HealthCare

Please see the FAQs and announcements for more information. Questions on
the challenge can be addressed to Ozlem Uzuner, i2b2nlp at albany.edu.
