[Corpora-List] Announcement of Data Release and Call for Participation
nlp06 at albany.edu
Thu Feb 7 18:20:47 UTC 2008
Announcement of Data Release and Call for Participation
Second i2b2 Shared Task and Workshop
Challenges in Natural Language Processing for Clinical Data
Informatics for Integrating Biology and the Bedside, i2b2, a National
Center for Biomedical Computing, is ready to release fully de-identified
discharge records for its Second shared task!
In collaboration with State University of New York at Albany, MIT Computer
Science and Artificial Intelligence Laboratory, and Partners Healthcare
System, i2b2 is pleased to announce the Second Shared Task and Workshop on
Challenges in Natural Language Processing for Clinical Data.
Data Release and Preliminary Call for Participation.
The Second i2b2 Shared Task on Challenges in Natural Language Processing
for Clinical Data opens to preregistration on February 1, 2008. The 2008
Challenge is a multi-class, multi-label classification task focused on
obesity and its co-morbidities. The data for the challenge consists of
discharge summaries from Partners Healthcare. All records have been fully
de-identified and annotated for obesity and co-morbidities.
Training data for the 2008 Challenge will be released in installments; the
first installment will be released on March 15, 2008, and the remaining
installments will follow soon after. Test data is scheduled to be
released for three days only and is to be used for evaluation purposes
only. The results of the shared-task challenge will be presented at
the workshop organized by i2b2 (date and location TBA).
Data will be released under a Data Use Agreement and is to be used for the
Challenge only. Obtaining the data requires completing a preregistration
and signing the Data Use Agreement. All members of a team are requested to
sign the Data Use Agreement.
Evaluation Dates, File Formats, and Evaluation Metrics.
The Obesity challenge will be evaluated on the test data only. The
participating teams are asked to stop development as soon as they download
the test data. System output on the test data is to be returned to the
organizers for evaluation through this website within three days of test
data release. Each team is allowed to upload up to three system runs.
System output is expected only in the form of standoff annotations,
following the exact format of the ground truth annotations provided by
i2b2. We are unable to evaluate output that does not comply with this
standard. Precision, recall, and F-measure (beta = 1), computed per class,
will be used as evaluation metrics.
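
For teams that want to check their output locally before uploading, the
per-class metrics named above can be sketched roughly as follows. This is
only an illustration, not the organizers' evaluation script: the function
name per_class_prf, the (document, disease) keys, and the labels "Y" and "N"
are all assumptions made for the example, and the actual annotation format
is the one distributed by i2b2.

```python
from collections import Counter


def per_class_prf(gold, predicted, beta=1.0):
    """Return {label: (precision, recall, f_measure)} computed per class.

    `gold` and `predicted` map a hypothetical (document, disease) key to a
    class label. A key missing from `predicted` counts as a miss for the
    gold class and is charged to no predicted class.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for key, gold_label in gold.items():
        pred_label = predicted.get(key)
        if pred_label == gold_label:
            tp[gold_label] += 1
        else:
            fn[gold_label] += 1
            if pred_label is not None:
                fp[pred_label] += 1

    b2 = beta * beta
    scores = {}
    for label in set(tp) | set(fp) | set(fn):
        n_pred = tp[label] + fp[label]   # times the system output this label
        n_gold = tp[label] + fn[label]   # times the label appears in gold
        p = tp[label] / n_pred if n_pred else 0.0
        r = tp[label] / n_gold if n_gold else 0.0
        f = (1 + b2) * p * r / (b2 * p + r) if p + r else 0.0
        scores[label] = (p, r, f)
    return scores


# Toy data with made-up documents and labels, for illustration only.
gold = {
    ("doc1", "Obesity"): "Y",
    ("doc1", "Asthma"): "N",
    ("doc2", "Obesity"): "Y",
    ("doc2", "Asthma"): "Y",
}
predicted = {
    ("doc1", "Obesity"): "Y",
    ("doc1", "Asthma"): "Y",
    ("doc2", "Obesity"): "N",
    ("doc2", "Asthma"): "Y",
}
scores = per_class_prf(gold, predicted)
for label in sorted(scores):
    p, r, f = scores[label]
    print(f"{label}: P={p:.3f} R={r:.3f} F={f:.3f}")
```

With beta = 1 the F-measure reduces to the harmonic mean of precision and
recall, so a class that is never predicted correctly scores zero on all
three metrics.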
Participants are asked to submit a short paper describing their system and
analyzing their performance. Papers should be in AMIA style and should not
exceed five pages. Authors of top performing systems and of particularly
novel approaches will be invited to present or demo their systems at the
workshop. All submissions will be considered for publication in a special
issue of JAMIA.
Tentative Schedule:
February 1, 2008 Preregistration Open
March 15, 2008 Training Data Release
April 15, 2008 Commitment to Participate in Challenge
June 23, 2008 Test Data Release at 9am EST
June 25, 2008 Output Due at Midnight EST
August 1, 2008 Notification of Results to Each Participant
September 1, 2008 Final Reports Due
October 1, 2008 Invitations to Present at the Workshop
November, 2008 Workshop (Pending approval by AMIA)
Organizing Committee:
Ozlem Uzuner, SUNY at Albany, Chair
Peter Szolovits, MIT CSAIL
Isaac Kohane, Partners Healthcare
Please see the FAQs and announcements (see side bar) for more information.
Questions on the shared task should be addressed to Ozlem Uzuner,
i2b2nlp at albany.edu.
_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora