Call: Workshop on Machine Learning for Information Extraction

Thu Feb 3 15:15:58 UTC 2000

From: Thierry Poibeau <Thierry.Poibeau at lcr.thomson-csf.com>

Workshop on Machine Learning for Information Extraction (Call for
submissions)
=====================================================================

Monday 21 August 2000

For workshop information, see:
http://ecate.itc.it:1025/cirave/ecai-workshop.html
For ECAI information, see: http://www.ecai2000.hu-berlin.de/

to be held in conjunction with the 14th European Conference on
Artificial Intelligence (ECAI), BERLIN, HUMBOLDT UNIVERSITY

Fabio Ciravegna (contact), ITC-irst Centro per la Ricerca Scientifica e
Tecnologica,
Roberto Basili, University of Roma Tor Vergata,
Robert Gaizauskas, University of Sheffield

The exponential increase in the quantity of textual information held in
digital archives has fuelled growing interest in computer-assisted
techniques for information extraction (IE) from text. IE systems, as
understood by the applied natural language processing community,
identify predetermined relevant information in text documents from some
specific domain. Once extracted, the information can be used for a
number of purposes: database population, text indexing, information
highlighting, and so on. While significant progress in constructing such
systems has been made, stimulated in particular by the DARPA Message
Understanding Conferences, by general agreement the main barriers to
wider use and commercialisation of IE are the difficulties in adapting
systems to new applications and domains. Porting IE systems is generally
both difficult and expensive, given the current technology, since
changes generally need to be carried out manually by highly skilled
experts. Moreover, some sources (e.g. Web pages) may change very rapidly
in both format and content. Tracking all the changes and continuously
re-adapting IE systems is very expensive, or even infeasible, if done
manually.

To address these difficulties there has been increasing interest in
applying machine learning (ML) techniques to Information Extraction from
text. Tasks to which ML has been applied include template design,
template filling, named entity recognition and resource compilation
(e.g. lexicons, knowledge structures, grammars). The kinds of sources
analysed range from structured texts (e.g. Web pages) to semi-structured
texts (e.g. rental ads) to free texts (e.g. newspaper articles). The ML
techniques used range from symbolic methods (e.g. inductive
logic programming, transformation-based learning) to numerical
methods (e.g. naive Bayes, maximum entropy). However, the current
situation is characterized by isolated experiments in which individual
ML techniques are applied to specific IE tasks. What is lacking is a
unifying view of the issue of adopting ML techniques for IE.

The proposed workshop aims to establish a forum for discussing current
and future trends of the application of ML to IE, with a specific focus
on the identification of a unifying view of the issue. The workshop has
the following goals:

* to bring together communities of researchers that address the ML for
IE problem from different perspectives (e.g., natural language
processing, information retrieval, machine learning, information
integration);
* to deepen the European IE community's understanding of the state of
the art;
* to identify further IE-related problems for which ML techniques might
be appropriate.

Particularly welcome are contributions concerning:
* descriptions of techniques adaptable for different languages, tasks
and/or text typologies;
* proposals of unifying views on the current or future application of ML
to IE.

In the interest of promoting as much discussion as possible, the number
of paper presentations will be limited in favour of panels and posters.
A final panel will discuss the research agenda for the coming years.

Attendance will be limited to 30 participants.

Formatting Guidelines
Follow the formatting guidelines for ECAI papers.

Important dates

Submission deadline: 12 March 2000
Notification of acceptance: 7 May 2000
Camera-ready versions of accepted papers due: 7 June 2000

For further information, please contact Fabio Ciravegna (cirave at irst.itc.it)