Corpora: LREC Workshop call
Roberta Catizone
r.catizone at dcs.shef.ac.uk
Mon Mar 11 10:21:57 UTC 2002
Call for Papers
******Extended Deadline for Submission******
'Event Modelling for Multilingual Document Linking'
LREC 2002 Workshop
2nd June 2002, Las Palmas, Canary Islands - Spain
http://www.lrec-conf.org/lrec2002/
This Workshop, although focussed on specific issues in Learning
applied to Content/Event Modelling of Multilingual Documents (especially
the role of verb representations) will also be a forum for discussing
more general issues of Learning in Advanced HLT applications.
Specific Motivation and Aims
There is a growing need for modelling the content of multilingual
documents for purposes such as IR and IE which rest on classification
and document similarity.
This workshop aims to discuss the issues surrounding content modelling of
multilingual documents, and in particular the issue of whether event
representations can function as cross-document links.
Specific issues are:
1. How large-scale language resources can be used in Content Modelling.
2. How NLP techniques can be used to full advantage in event/content modelling.
Can generic tools be used, or is it inevitable that the tools developed will be
application-specific?
3. Discussion of learning techniques that have been applied to such tasks.
4. Discussion of applications that use content/event modelling as an
intermediate stage or end result in document linkage.
5. Issues concerning evaluation of how well the content is modelled.
6. Multilingual document analysis.
General Motivation and Aims:
The application of HLT to current IT trends requires large amounts of
specific linguistic resources. However, existing large scale resources
are not normally intended (i.e. designed and handcrafted) for specific
application tasks.
In order to bridge the gap with specific applications, a variety of methods for
acquisition, adaptation and integration of linguistic resources have
been proposed in the NLP research area since the late 1980s.
Machine learning and statistical techniques have been employed
as devices to deal with the scale and the complexity of the
problem. Although this is a substantial area of research, the impact of
these technologies on applications is still low with respect to their
potential. Open problems are:
- the unclear targets of the learning activity: no general consensus
exists among the proposed approaches as to the quality and quantity of
linguistic information needed for different tasks (e.g. what is the most
suitable representation that captures selective information from the LR
training material able to optimize parsing accuracy? Is it fully
grammatical, as in bracketed corpora, or lexical?);
- the heterogeneity of sources: relevant information for the adaptation
task can be distributed in different repositories (LKBs and texts) or
expressed differently (in different languages and/or raw, e.g. texts,
vs. semistructured data, e.g. HTML/XML formats);
- the architectural idiosyncrasies: proposed learning systems make
reference to different sources of information in different pipelined (or
redundant, as in voting) application architectures;
- the application scope: current applications make limited use (if
any) of available adaptation technologies. This often limits the scale
reachable by current HLT applications.
The above issues are orientated towards reducing the complexity of the problem
in current research, given the enormous potential of the application
field in areas like Web Mining, Q/A and Knowledge Management.
This workshop aims to bring together researchers of both academic and
industrial organizations interested in:
- Theoretical and Practical aspects of adaptive Natural Language
Processing
- Models of Acquisition and Integration of Domain Knowledge
- Integration of induction models from heterogeneous data (lexicons vs.
ontologies, texts vs. HTML/XML pages)
- Learning Multilingual Information by exploiting Multilingual Resources
(e.g. EWN)
- Theoretical and Practical aspects of Lexical Acquisition in
multilingual scenarios
- Architectures for learning, adaptation, and integration of LR
- Adaptive HLT applications (including but not limited to search,
retrieval, navigation and Q/A)
- Which levels of representation (terms, event structures, etc.) are
most appropriate for document linkage
Papers are invited for presenting theoretical and methodological aspects
of Machine Learning of Natural Language as well as approaches making
effective use of adaptive methods in the perspective of pre-industrial
or industrial applications.
Program Committee
Roberta Catizone     University of Sheffield
Walter Daelemans     CNTS/Language Technology Group, Antwerp
M. V. Marabello      KnowledgeStones S.p.A
M. T. Pazienza       University of Roma, Tor Vergata
G. Rigau             Polytechnical University of Catalunia
Horatio Rodriguez    Polytechnical University of Catalunia
A. Setzer            University of Sheffield
N. Webb              University of Sheffield
Y. Wilks             University of Sheffield
Rémi Zajac           Systran Software, CA
F.M. Zanzotto        University of Roma, Tor Vergata
Contact person
Roberta Catizone
University of Sheffield
211 Portobello Street, Regent Court, S1 4DP Sheffield (UK)
phone: +44 114 2221897; fax: +44 114 2221810
r.catizone at dcs.shef.ac.uk
Time schedule (Important Dates)
Deadline for workshop abstract submission: 20th of March 2002
Notification of acceptance: 27th of March 2002
Final version of paper for proceedings: 15th of April 2002
Workshop: 2nd of June 2002
Agenda
Morning Session:
- 1st Invited Talk (8:00-9:00)
- Technical Papers (9:00-11:30)
- 2nd Invited Talk (11:30-12:30)
- Panel and Round Table (12:30-1:30)
A summary of the intended workshop Call for Participation.
In the workshop the following invited speakers are expected:
- Roberto Basili (University of Roma, Tor Vergata)
- Fabio Ciravegna (University of Sheffield)
A panel session on "Adaptive Technologies and their implications on
advanced HLT applications (IR, IE, Q&A and KM)" is also planned.
Distinguished panelists will be invited, some of whom have confirmed their
participation:
- Nino Varile (EC Commission)
- F. Gardin (AISoftware)
Submissions
Papers should describe existing research connected to the topics
of the workshop. The presentation at the workshop will be 30
minutes long (20 minutes for presentation and 10 minutes for
questions and discussion). Each submission should show: title;
author(s); affiliation(s); and contact author's e-mail address,
postal address, telephone and fax numbers. Abstracts should be at most
2 pages, in plain-text format.
The final version of the accepted papers should be no longer than
10 A4 pages. Instructions for formatting and presentation of
the final version will be sent to authors upon notification
of acceptance.
The workshop was previously named 'Learning for Advanced HLT Applications:
from Language Resources to Processes'