[Corpora-List] SIGIR 2004 IR4QA Workshop - Call for Participation

Mon Jun 28 17:25:12 UTC 2004

                        SIGIR'04 Workshop

                     Call for Participation

         INFORMATION RETRIEVAL FOR QUESTION ANSWERING (IR4QA)

                   July 29, 2004, Sheffield, UK

For registration details see http://www.sigir.org/sigir2004

IMPORTANT: You do not have to register for the full SIGIR conference
           to register and attend the IR4QA workshop.

Open domain question answering has become a very active research area
over the past few years, due in large measure to the stimulus of the
TREC Question Answering track. This track addresses the task of
finding *answers* to natural language (NL) questions (e.g. ``How
tall is the Eiffel Tower?'', ``Who is Aaron Copland?'') from large text
collections. This task stands in contrast to the more conventional IR
task of retrieving *documents* relevant to a query, where the
query may be simply a collection of keywords (e.g. ``Eiffel Tower'',
``American composer, born Brooklyn NY 1900, ...'').

Finding answers requires processing texts at a level of detail that
cannot be carried out at retrieval time for very large text
collections. This limitation has led many researchers to propose,
broadly, a two-stage approach to the QA task. In stage one a subset of
query-relevant texts is selected from the whole collection.  In stage
two this subset is subjected to detailed processing for answer
extraction. To date stage one has received limited explicit attention,
despite its obvious importance -- performance at stage two is bounded
by performance at stage one.  The goal of this workshop is to correct
this situation and, hopefully, to draw the attention of IR researchers
to the specific challenges raised by QA.

A straightforward approach to stage one is to employ a conventional IR
engine, using the NL question as the query and with the collection
indexed in the standard manner, to retrieve the initial set of
candidate answer-bearing documents for stage two.  However, a number
of possibilities arise to optimise this set-up for QA, including:
o preprocessing the question in creating the IR query;
o preprocessing the collection to identify significant information that
  can be included in the indexation for retrieval;
o adapting the similarity metric used in selecting documents;
o modifying the form of retrieval return, e.g. to deliver passages
  rather than whole documents.
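
As a rough illustration of stage one with two of the options listed
above (question preprocessing and passage-level return), a minimal
Python sketch follows. It is not the method of any paper presented at
the workshop, nor the interface of any particular IR engine: the
stopword list, the sentence-as-passage split, the TF-IDF overlap score,
the toy documents and all function names are assumptions made purely
for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical stopword list for turning an NL question into a keyword query.
STOPWORDS = {"how", "who", "what", "is", "the", "a", "an", "of", "to", "in", "was"}

# Toy collection, invented for this example only.
TOY_DOCS = [
    "The Eiffel Tower is in Paris. The Eiffel Tower is 324 metres tall.",
    "Aaron Copland was an American composer. He was born in Brooklyn in 1900.",
]


def tokenize(text):
    """Lowercase the text and keep runs of letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())


def question_to_query(question):
    """Question preprocessing: drop stopwords, keep the remaining terms."""
    return [t for t in tokenize(question) if t not in STOPWORDS]


def split_passages(document):
    """Passage-level return: treat each sentence as one passage (an assumption)."""
    return [p.strip() for p in re.split(r"(?<=[.!?])\s+", document) if p.strip()]


def retrieve(question, documents, k=2):
    """Stage one: rank passages by a simple smoothed TF-IDF overlap score."""
    passages = [p for d in documents for p in split_passages(d)]
    tokenized = [tokenize(p) for p in passages]
    n = len(passages)
    # Document frequency is counted over passages, not whole documents.
    df = Counter(term for toks in tokenized for term in set(toks))
    query = question_to_query(question)
    scored = []
    for passage, toks in zip(passages, tokenized):
        tf = Counter(toks)
        score = sum(tf[t] * math.log((n + 1) / (df[t] + 1)) for t in query if t in tf)
        if score > 0:
            scored.append((score, passage))
    # Highest score first; ties keep their original order (stable sort).
    scored.sort(key=lambda pair: -pair[0])
    return [passage for _, passage in scored[:k]]


if __name__ == "__main__":
    for hit in retrieve("How tall is the Eiffel Tower?", TOY_DOCS):
        print(hit)
```

The passages returned by retrieve() would then be handed to stage two
for answer extraction, which is why stage-two performance cannot exceed
what stage one manages to retrieve.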

The workshop will consist of presentations of the following
accepted papers:
o What Works Better for Question Answering: Stemming or
  Morphological Query Expansion
  Matthew W. Bilotti, Boris Katz and Jimmy Lin
o A Comparative Study on Sentence Retrieval for Definitional
  Question Answering
  Hang Cui, Min-Yen Kan, Tat-Seng Chua and Jing Xiao
o Using Pertainyms to Improve Passage Retrieval for Questions
  Requesting Information About a Location
  Mark A. Greenwood
o Minimal Span Weighting Retrieval for Question Answering
  Christof Monz
o Simple Translation Models for Passage Retrieval for QA
  Vanessa Murdock, W. Bruce Croft
o Sense-Based Blind Relevance Feedback for Question Answering
  Matteo Negri
o Exploring the Performance of Boolean Retrieval Strategies
  For Open Domain Question Answering
  Horacio Saggion, Rob Gaizauskas, Mark Hepple,
  Ian Roberts and Mark A. Greenwood
o Boosting Weak Ranking Functions to Enhance Passage Retrieval
  For Question Answering
  Nicolas Usunier, Massih R. Amini and Patrick Gallinari
o Seeking an Upper Bound to Sentence Level Retrieval in
  Question Answering
  Kieran White and Richard F. E. Sutcliffe
o Domain-Specific QA for the Construction Sector
  Zhuo Zhang, Lyne Da Sylva, Colin Davidson, Gonzalo Lizarralde
  and Jian-Yun Nie

Workshop Organizers
===================

Rob Gaizauskas          (University of Sheffield)
Mark Hepple             (University of Sheffield)
Mark Greenwood          (University of Sheffield)

Programme Committee
===================

Shannon Bradshaw        (University of Iowa)
Charles Clarke          (University of Waterloo)
Sanda Harabagiu         (University of Texas at Dallas)
Eduard Hovy             (University of Southern California)
Jimmy Lin               (Massachusetts Institute of Technology)
Christof Monz           (University of Maryland)
John Prager             (IBM)
Dragomir Radev          (University of Michigan)
Maarten de Rijke        (University of Amsterdam)
Horacio Saggion         (University of Sheffield)
Karen Sparck-Jones      (University of Cambridge)
Tomek Strzalkowski      (State University of New York, Albany)
Ellen Voorhees          (NIST)