<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">

<HTML>

<HEAD>

<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1">

<META NAME="Generator" CONTENT="MS Exchange Server version 6.5.7226.0">

<TITLE>Call for papers: Workshop on language processing in the biomedical domain</TITLE>

</HEAD>

<BODY>

<!-- Converted from text/plain format -->


<P><FONT SIZE=2>Hi, CORPORA list people,<BR>

<BR>

The list of topics for this meeting includes a number of areas of<BR>

interest to corpus linguists, including:<BR>

<BR>

- corpus construction efforts<BR>

- evaluation and testing of systems<BR>

- test suites for biomedical language processing systems<BR>

<BR>

It's being held in Detroit, MI, the day before the ACL Tutorial Day.<BR>

<BR>

Workshop title: Linking Biological Literature, Ontologies and<BR>

Databases: Mining Biological Semantics<BR>

<BR>

Description<BR>

<BR>

This workshop will bring researchers in natural language processing in<BR>

the bioinformatics and biomedical domains together with scientists in<BR>

bioinformatics and biology.  It follows successful workshops on the<BR>

topic at ACL 2002, 2003, and 2004, and NAACL 2004, as well as related<BR>

meetings at PSB (Pacific Symposium on Biocomputing) and ISMB<BR>

(Intelligent Systems in Molecular Biology).  This will be a joint<BR>

workshop with the ISMB SIG on text mining for biology, and it will be<BR>

colocated with the ISCB annual meeting in Detroit, MI, on June 24,<BR>

2005.<BR>

<BR>

Recent years have seen an interesting confluence between the worlds of<BR>

bioinformatics and natural language processing.  Molecular biologists,<BR>

confronted with new high-throughput sources of data, have recognized<BR>

that language processing can provide them with tools for handling a<BR>

flood of data that is unprecedented in the history of the life<BR>

sciences.  The natural language processing community, in turn, has<BR>

become aware of the resources that the computational bioscience<BR>

community has made available, and there has been growing interest in<BR>

applying natural language processing techniques to mine the biological<BR>

literature to support complex applications in the biological domain,<BR>

ranging from identifying relevant literature (information retrieval)<BR>

to extraction of experimental findings to populate biological knowledge<BR>

bases, to summarization that presents key facts to biologists in succinct<BR>

form.<BR>

<BR>

A number of successful conferences and workshops have resulted, with<BR>

significant progress in the areas of entity identification, concept<BR>

normalization, and system evaluation coming through competitions like<BR>

the KDD Cup and BioCreAtIvE, and through shared resources like the Genia<BR>

corpus.<BR>

<BR>

This workshop will continue the interaction between these communities.<BR>

Papers on the role of ontologies in understanding biomedical texts<BR>

and on evaluation and testing of systems built for these domains are<BR>

especially invited, but submissions on all topics related to natural<BR>

language processing in bioinformatics, biomedicine, and molecular<BR>

biology are welcome, including:<BR>

<BR>

- the role of ontologies and knowledge bases in understanding biomedical texts<BR>

- knowledge representation<BR>

- evaluation and testing of systems<BR>

- test suites for biomedical language processing systems<BR>

- entity identification and normalization<BR>

- information extraction<BR>

- information retrieval<BR>

- corpus construction efforts<BR>

- coreference and anaphora resolution<BR>

- visualization<BR>

<BR>

Target audience and expected number of participants<BR>

<BR>

The target audience is researchers in natural language processing in<BR>

the molecular biology, medical, and associated domains.  We expect<BR>

these researchers to come from the fields of linguistics, computer<BR>

science, bioinformatics, medical informatics, and molecular biology.<BR>

<BR>

The expected number of participants is 70.<BR>

<BR>

Workshop length<BR>

<BR>

The workshop length will be one day.<BR>

<BR>

Organizing committee<BR>

<BR>

Kevin Bretonnel Cohen leads the Biomedical Text Mining Group at the<BR>

University of Colorado's Center for Computational Pharmacology.  He is<BR>

the author of a number of papers and one book chapter on natural<BR>

language processing in the biomedical domain.  Current projects in the<BR>

Center for Computational Pharmacology include an NIH R-01-funded<BR>

project to build a molecular biology knowledgebase using text data<BR>

mining; an information extraction project targeting assertions about<BR>

translocation of proteins; and ongoing research in software testing<BR>

techniques for natural language processing software.<BR>

<BR>

Lynette Hirschman is Chief Scientist for the Information Technology<BR>

Center at MITRE in Bedford, MA, where she leads MITRE's efforts in<BR>

bioinformatics and text mining for biology.  Her group has been<BR>

responsible for the 2002 KDD Challenge Cup Evaluation Task 1:<BR>

Information Extraction for Biomedical Articles and the 2004<BR>

BioCreAtIvE challenge evaluation in biomedical entity extraction (in<BR>

conjunction with Alfonso Valencia and Christian Blaschke at the Centro<BR>

Nacional de Biotecnología).  Recent research projects have included<BR>

the use of curated biological databases as a source of noisy training data to<BR>

train statistical entity extraction systems, and tools to aid curators<BR>

for biological databases. She is the co-organizer of the ISMB Special<BR>

Interest Group on Text Mining for Biology (with Alfonso Valencia) and<BR>

is currently serving on the Gene Ontology Consortium Advisory<BR>

Committee.<BR>

<BR>

Christian Blaschke is the project leader for text mining and<BR>

information extraction systems at bioalma in Madrid.  He was the first<BR>

author of the earliest paper on rule-based information extraction from<BR>

molecular biology literature.  His recent projects have included being<BR>

an organizer of the first BioCreAtIvE (Critical Assessment of<BR>

Information Extraction Systems in Biology) competition on biological<BR>

text data mining.  His current work involves leading the development<BR>

of text mining systems for pharmaceutical and biotechnology companies.<BR>

<BR>

Hagit Shatkay is an assistant professor in the School of Computing at<BR>

Queen's University in Kingston, Ontario.  Her research is in the area<BR>

of machine learning as it applies to biomedical data mining.  She is<BR>

an active member of the biomedical text-mining research community,<BR>

where her work focuses on biomedical information retrieval.  She has<BR>

presented tutorials on biomedical literature mining at the Pacific<BR>

Symposium on Biocomputing, the Bioinformatics Summer School, and the<BR>

International Conference on Intelligent Systems for Molecular Biology,<BR>

and has recently established BLIMP, a web-based forum for Biomedical<BR>

Literature Mining Publications.  Prior to joining Queen's University,<BR>

she was a researcher with the Informatics Research group at<BR>

Celera/Applied Biosystems, following a postdoctoral fellowship at the<BR>

National Center for Biotechnology Information.  She holds a PhD in<BR>

computer science from Brown University, and an MSc and BSc in computer<BR>

science from the Hebrew University in Jerusalem.<BR>

<BR>

Program Committee<BR>

<BR>

We have assembled a strong set of people from academia, industry, and<BR>

government in the US, Europe, and Japan.  The program committee<BR>

includes researchers with world-class reputations in this field.<BR>

<BR>

Sophia Ananiadou, University of Salford<BR>

Lan Aronson, NLM<BR>

Breck Baldwin, Alias-i Inc.<BR>

Olivier Bodenreider, NLM<BR>

Shannon Bradshaw, University of Iowa<BR>

Bob Carpenter, Alias-i Inc.<BR>

Jeff Chang, Duke University<BR>

Aaron Cohen, Oregon Health Sciences University<BR>

Nigel Collier, National Institute of Informatics, Japan<BR>

Lynne Fox, University of Colorado Health Sciences Center<BR>

Bob Futrelle, Northeastern University<BR>

Henk Harkema, University of Sheffield<BR>

Marti Hearst, University of California at Berkeley<BR>

Larry Hunter, University of Colorado School of Medicine<BR>

Steve Johnson, Columbia University<BR>

Marc Light, University of Iowa<BR>

Hongfang Liu, University of Maryland at Baltimore County<BR>

Alex Morgan, MITRE<BR>

James Pustejovsky, Brandeis University<BR>

Tom Rindflesch, NLM<BR>

Andrey Rzhetsky, Columbia University<BR>

Jasmin Saric, EML Research gGmbH<BR>

Lorrie Tanabe, NCBI, NLM<BR>

Jun-ichi Tsujii, University of Tokyo<BR>

Alfonso Valencia, Universidad Autonoma de Madrid<BR>

Karin Verspoor, Los Alamos National Labs<BR>

John Wilbur, NCBI, NLM<BR>

Hong Yu, Columbia University<BR>

<BR>

--<BR>

K. B. Cohen<BR>

Biomedical Text Mining Group Lead<BR>

Center for Computational Pharmacology<BR>

303-916-2417 (cell) 303-377-9194 (home)<BR>

<A HREF="http://compbio.uchsc.edu/Hunter_lab/Cohen">http://compbio.uchsc.edu/Hunter_lab/Cohen</A><BR>

<A HREF="http://www.ling.ohio-state.edu/~kcohen/">http://www.ling.ohio-state.edu/~kcohen/</A><BR>

</FONT>

</P>


</BODY>

</HTML>