[Corpora-List] ACL04 Workshop: Question Answering in Restricted Domains

Diego Molla diego at ics.mq.edu.au
Mon Mar 8 00:23:24 UTC 2004


                  LAST CALL FOR PAPERS

ACL04 WORKSHOP ON QUESTION ANSWERING IN RESTRICTED DOMAINS
            Barcelona, Spain, 25-26 July 2004
            Submission deadline: 15 March 2004
    http://www.clt.mq.edu.au/Events/Conferences/acl04qa/

Much of the current research in question answering systems is driven
by programs such as AQUAINT and evaluation exercises such as TREC,
NTCIR and CLEF, all of which focus on open-domain question
answering. The availability of large volumes of data (e.g. documents
extracted from the World Wide Web) has prompted the development of
systems that focus on shallow text processing.

But there are many document sets in restricted domains that are
potentially valuable as a source for question answering systems. For
example, the documentation pages of Unix and Linux systems would make
an ideal corpus for QA systems targeted at users who want to know how
to use these operating systems. There is a wealth of information in
other technical documentation such as software manuals, car
maintenance manuals, and encyclopediae of specific areas such as
medicine. Users interested in these specific areas would benefit from
QA systems targeted to their areas of interest.

Restricted domains typically have limited data available and therefore
conventional techniques based on data redundancy simply cannot be
applied effectively. The scarcity of data available seems to
call for a more targeted, NLP-intensive approach to QA. The use of
additional corpora such as the WWW raises a number of interesting
questions. For instance, will these corpora help or obstruct the
proper functioning of an NLP-intensive approach to QA? And how do we
find good pockets of information that are appropriate to the chosen
domains?

On the other hand, restricted domains (e.g. law, medicine) have
specific stylistic conventions. Often these domains use terminology
that is not stored in conventional lexica. Consequently, NLP approaches
devised for open-domain systems may under-perform on these specific
domains, thus raising the question of how portable these systems can
be.

In this workshop we aim to answer some of the following questions:

* Are open-domain question answering techniques appropriate for QA in
   restricted domains?

* Can we use generic large corpora and/or the WWW? How can we identify
   specific pockets of information in these generic corpora?

* How can we use specific sources such as the CIA factbook, acronym
   lists, e-commerce sites (e.g. eBay), and specialized glossaries and
   encyclopedias? How can we discover new specific sources?

* What types of question-answering techniques are best for what types
   of restricted domains?

* Is it easy/possible/worthwhile to develop domain-independent QA
   systems for restricted domains? What would be the cost of porting a
   QA system to a specific domain?

* Are restricted domains more suitable than open domains to drive
   research in NLP?

* Is evaluation of restricted-domain QA systems different from that of
   open-domain QA systems?

We welcome papers that address any of the above questions or that
focus on any of the following topics:

* Comparison between open-domain and restricted-domain QA

* Characterisation of the types of restricted domains and the
   technology required for QA on those domains

* Methodologies and/or tools for restricted-domain QA

* Description of specific restricted-domain QA systems

* Development of modules (e.g. document preselection, NE extraction,
   terminology extraction) for use in restricted-domain QA systems

* Portability of QA systems between different restricted domains

* Evaluation of restricted-domain QA systems


SUBMISSION PROCEDURE

Authors should submit full papers of at most 8 pages, including
references and figures, following the main conference ACL style format
(http://www.acl2004.org/aclstyles/style.html). The review will not be
blind. Submissions must be in PS or PDF format and should be sent
to diego at ics.mq.edu.au

PROGRAM COMMITTEE

Organizers:
-----------

Diego Mollá             Macquarie University, Australia
José Luis Vicedo        Alicante University, Spain

Committee:
----------

In alphabetical order by first name:

Anselmo Peñas           UNED, Spain
Antonio Ferrández       Alicante University, Spain
Bernardo Magnini        ITC-Irst, Italy
Bonnie Webber           University of Edinburgh, UK
Donna Harman            NIST, USA
Ellen Voorhees          NIST, USA
Fabio Rinaldi           University of Zurich, Switzerland
Felisa Verdejo          UNED, Spain
Graeme Hirst            University of Toronto, Canada
Horacio Rodríguez       Universitat Politècnica de Catalunya, Spain
Ingrid Zukerman         Monash University, Australia
Jimmy Lin               MIT, USA
Johan Bos               University of Edinburgh, UK
Juergen Franke          DaimlerChrysler AG, Germany
Julio Gonzalo           UNED, Spain
Lynette Hirschman       MITRE, USA
Maarten de Rijke        University of Amsterdam, The Netherlands
Manuel Palomar          Alicante University, Spain
Mark Maybury            MITRE, USA
Michael Hess            University of Zurich, Switzerland
Pierre Zweigenbaum      DIAM, France
Richard Sutcliffe       University of Limerick, Ireland
Rolf Schwitter          Macquarie University, Australia
Sanda Harabagiu         University of Texas, USA


IMPORTANT DATES

* 15 March 04        Paper submission
* 15 April 04        Notification of acceptance
* 15 May 04          Camera ready version
* 25 or 26 July 04   Workshop (final date not yet determined)


CONTACT DETAILS

Diego Mollá
Centre for Language Technology
Division of Information and Communication Sciences
Macquarie University
New South Wales 2109
Australia

Tel. +61 2 9850 9531
Fax  +61 2 9850 9551
diego at ics.mq.edu.au


