[Corpora-List] Call for Participation: Shared Task DDIExtractor2011: TRAINING DATA AVAILABLE!!!

Isabel Segura isegura at inf.uc3m.es
Fri Apr 1 16:46:33 UTC 2011


Apologies for multiple postings.
____________________________

CALL FOR PARTICIPATION: Shared Task on Drug-Drug Interaction
Extraction (DDIExtraction2011)

 (http://labda.inf.uc3m.es/DDIExtraction2011/)


** TRAINING DATA AVAILABLE NOW at
http://labda.inf.uc3m.es/DDIExtraction2011/dataset.html


_____________________________

== THE TASK ==

A drug-drug interaction (DDI) occurs when one drug influences the
level or activity of another, for example by raising its blood levels
and possibly intensifying its side effects, or by decreasing its
concentration and thereby reducing its effectiveness. The detection
of DDIs is an important research area in patient safety, since these
interactions can become very dangerous and increase health care costs.
Although several databases support health care professionals in the
detection of DDIs, these databases are rarely complete, since their
update periods can reach three years [1]. Drug interactions are
frequently reported in clinical pharmacology journals and technical
reports, making the medical literature the most effective source for
the detection of DDIs. Managing DDIs is therefore a critical issue,
given the overwhelming amount of information available on them [2].
Information Extraction (IE) can be of great benefit to the
pharmaceutical industry, allowing the identification and extraction of
relevant information on DDIs and reducing the time health care
professionals spend reviewing the literature. Moreover, the
development of tools for automatically extracting DDIs is essential
for improving and updating drug knowledge databases. Most research has
centered on biological relationships (genetic and protein-protein
interactions (PPI)), mainly because annotated corpora are available in
the biological domain, which facilitates the evaluation of approaches.
Few approaches have focused on the extraction of DDIs.

In the last decade, Information Extraction techniques have received
increasing interest as a suitable way to extract and analyse the
huge volume of published documents in the biological domain. The
BioCreAtIvE (http://www.biocreative.org/) (Critical Assessment of
Information Extraction systems in Biology) challenges have played a
key role in improving the Information Extraction techniques applied to
the biological domain by providing a common benchmark for evaluating
them. Recently, the medical and pharmacological domains have also
begun to benefit from such technology. However, there is no forum for
comparing the various techniques. Just as the BioCreative challenge
evaluation has provided a common framework for evaluating text mining
and has driven progress in the biological domain, this task is
intended to provide a benchmark forum where researchers can compare
the latest advances in these techniques applied to the extraction of
drug-drug interactions. We think that this new task will appeal to
groups studying Protein-Protein Interaction (PPI) extraction, because
they could adapt their systems to extract drug-drug interactions.

We have created a specific corpus, the DrugDDI corpus, consisting of a
collection of biomedical texts annotated with drug-drug interactions
(DDI). The main value of the DrugDDI corpus comes from its annotation,
since all the documents have been marked up with drug-drug
interactions by a pharmacist. Although there may be relations between
drugs in different sentences, they have not been annotated in the
DrugDDI corpus. We provide our corpus in two different formats: (1) a
format based on the information provided by the UMLS MetaMap tool
(MMTx) and (2) the unified XML format for Protein-Protein Interaction
Extraction proposed in [1]. Participants can choose between these two
formats depending on their preferences, since some systems may have no
use for MMTx information.
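
As a rough illustration of how the second format might be read, the
sketch below parses one sentence into labelled drug pairs. The sample
document, the element names (document, sentence, entity, pair) and the
attribute names (id, text, e1, e2, interaction) are assumptions
modelled on the unified PPI format described in [1]; they are not
taken from the released files and should be checked against the
actual dataset.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: element and attribute names are assumptions,
# not copied from the DrugDDI distribution.
SAMPLE = """
<document id="DOC1">
  <sentence id="DOC1.s0" text="Drug A may increase the plasma levels of drug B.">
    <entity id="DOC1.s0.e0" text="Drug A" type="drug"/>
    <entity id="DOC1.s0.e1" text="drug B" type="drug"/>
    <pair id="DOC1.s0.p0" e1="DOC1.s0.e0" e2="DOC1.s0.e1" interaction="true"/>
  </sentence>
</document>
"""


def read_pairs(xml_text):
    """Return (drug1, drug2, interacts) tuples for every annotated pair."""
    root = ET.fromstring(xml_text.strip())
    rows = []
    for sentence in root.iter("sentence"):
        # Map entity ids to their surface text within this sentence.
        names = {e.get("id"): e.get("text") for e in sentence.iter("entity")}
        for pair in sentence.iter("pair"):
            interacts = pair.get("interaction") == "true"
            rows.append((names[pair.get("e1")], names[pair.get("e2")], interacts))
    return rows


pairs = read_pairs(SAMPLE)
```

Because interactions are only annotated within a sentence, resolving
entity ids sentence by sentence is sufficient in this sketch.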

Each team participating in this task will initially have access only
to the training data. Later, the teams will have access to unlabelled
test data (that is, the shallow syntactic and semantic information
provided by MetaMap will be included, but drug-drug interactions will
not be labelled). The teams will submit their systems' predictions for
each pair of drugs in the same sentence. The training dataset contains
a total of 2809 sentences that contain two or more drugs, although
only 1532 contain at least one interaction. A total of 2421 drug-drug
interactions have been identified in the training dataset. The test
dataset necessary for the evaluation part of the task will be
available on May 30th. When DDIExtraction2011 is over, the labels for
the test data will be released to the public. Systems will be ranked
according to their F-scores.
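
The pairing and scoring described above can be sketched as follows.
This is only an illustration under stated assumptions: the
announcement does not say which F-score variant will be used, so the
code assumes the balanced F1 computed over positive (interacting)
pairs, and the function names candidate_pairs and f_score are
hypothetical.

```python
from itertools import combinations


def candidate_pairs(drugs):
    """All unordered pairs of drug mentions found in one sentence."""
    return list(combinations(drugs, 2))


def f_score(gold, predicted):
    """Balanced F1 over interacting pairs (assumed variant, see lead-in)."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# A sentence with three drug mentions yields three candidate pairs.
pairs = candidate_pairs(["a", "b", "c"])
gold = {("a", "b"), ("a", "c")}       # pairs annotated as interacting
predicted = {("a", "b"), ("b", "c")}  # pairs a system labelled as interacting
score = f_score(gold, predicted)
```

A sentence with n drug mentions produces n(n-1)/2 candidate pairs, so
sentences listing many drugs contribute many non-interacting pairs.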

Participants are encouraged to submit a paper to the workshop in order
to describe their systems for DDI extraction to the audience in a
regular workshop session together with special invited speakers.
Submitted papers will be reviewed by our program committee.

** More information is available at the DDIExtraction2011 website
(http://labda.inf.uc3m.es/DDIExtraction2011)


== THE DATES ==

Training dataset: April 1st
Test dataset and Evaluation starts: May 30th
Evaluation ends: June 6th
Task coordinators send evaluation: June 15th
Paper submission: June 26th
Notification of acceptance for papers: July 15th
Final Camera Ready paper due: July 30th
Workshop day: September 7th, 2011 (afternoon)

== TASK ORGANIZERS ==

- Isabel Segura-Bedmar, Universidad Carlos III of Madrid, Spain.
- Paloma Martínez, Universidad Carlos III of Madrid, Spain.
- Daniel Sánchez Cisneros, Universidad Carlos III of Madrid, Spain.

http://labda.inf.uc3m.es/


Looking forward to your submissions!

The DDIExtraction2011 Team


[1] S. Pyysalo, A. Airola, J. Heimonen, J. Bjorne, F. Ginter, and T. Salakoski.
Comparative analysis of protein-protein interaction corpora. BMC Bioinformatics.
Special issue, 9(Suppl 3):S6, 2008.

--
Isabel Segura Bedmar
Despacho 2.2.A.10
Telf: 91 624 99 88
Departamento de Informática
Universidad Carlos III de Madrid,
Laboratory for Advanced Database (LABDA)
http://labda.inf.uc3m.es/doku.php?id=en:inicio

http://labda.inf.uc3m.es/doku.php?id=en:labda_personal:personal_isegura
http://www.inf.uc3m.es/component/comprofiler/userprofile/isegura
