                        CALL FOR PARTICIPATION IN

                          PAPILLON-2004 Workshop
                    on Multilingual Lexical Databases
               Grenoble, August 30th-September 1st, 2004

                      immediately after COLING 2004

                Venue: IMAG Institute, Grenoble, France


Multilingual lexical databases are (i) databases for (ii) structured
lexical data which can be used either (iii) by humans e.g. to define
their own dictionaries or (iv) by natural language processing (NLP)
applications. Such databases are now considered indispensable in language
science, given the advances in language engineering. Like databases in
genomics, multilingual lexical databases need rich annotations; they
are complex, and they evolve as time goes by.

The Papillon project is a collaborative Web project that aims to
build an open source multilingual lexical database for several
languages (French, German, English, Japanese, Lao, Malay, Thai and
Vietnamese). The provided lexical information has to be rich enough
for a human to be able to query and generate his/her own tailored
dictionary (e.g. for language learning or for translation work) and
for NLP applications to be able to extract a whole range of data or to
directly exploit some particular data.

The 2004 Papillon workshop, the fifth in a series of workshops
organized every year by the Papillon members, will aim at identifying
problems relevant to the multilingual-lexical-database community. The
workshop aims to promote exchanges between practitioners from several
fields and is thus open to anybody working in a domain pertaining to
lexical databases such as: databases, man-machine interface for
dictionaries, data annotation, XML, standardization of dictionaries or
lexical data; lexicography, translation, computational linguistics,

Tentative Program

The program will have a varied format, designed to maximize
cross-fertilization among the various specialties, and to allow
extended open discussion. Components of the program will include:

- Tutorials on relevant models from linguistics, databases or
annotation, e.g. the structure of lexical entries and semi-structured
query languages;

- Panel sessions on annotated text and lexicons (and possibly

- Paper presentations reporting new research;

- Demonstrations of systems for creating and/or managing lexical

The following papers will be presented during the conference:

1.  Refining Algorithm of Extracted Pattern Rule Set from Penn
TreeBank Corpus. Akira Adachi, and Takenori Makino

2.  LC-STAR: XML-coded Phonetic Lexica and Bilingual Corpora for
Speech to Speech Translation. Folkert de Vriend, Nuria Castell, Jesus
Giménez, and Giulio Maltese

3.  Low Cost Automated Conceptual Vector Generation from Mono and
Bilingual Resources. Mathieu Lafourcade, Frédéric Rodrigo, and Didier

4.  ITOLDU: Accessing to Vocabulary learning in a technical English
resource pooling environment. Valérie Bellynck, and John Kenwright

5.  Ressource pooling for technical English learning via lexical
access. Valérie Bellynck, Christian Boitet, and John Kenwright

6.  Electronic Data for the Description of Japanese Kanji - The
Analyses of Brush Strokes, Stroke Groups and their Position and the
Building of Path Data to Display and Search Kanji. Ulrich Apel, and
Julien Quint

7.  Why have them work for peanuts, when it is so easy to provide
reward? One of the many possibilities of a dictionary converted into a
drill tutor. Michael Zock, and Julien Quint

8.  Multilingual Dictionary of Lexicographical Terms.  Svetlana
Krestova, and Peter J. Nürnberg

9.  Expanding the Lexicon: the Search for Abbreviations.  James Breen

10.  The Design of (Psycho)Linguistically-motivated Lexicons for
Natural Language Processing. Ariani Di Filippo, Bento Carlos

11.  Building a Specialised Multilingual Dictionary from General
Monolingual Dictionaries. Choy-Kim Chuah

12.  A semantic representation of emotions based on a dialogue corpus
analysis. Mutsuko Tomokiyo, and Solange Hollard

13.  An XML-based Tool for Tracking English Inclusions in German
Text. Beatrice Alex, and Claire Grover

14.  Historical-Comparative Reconstruction and Multilingual
Lexica. James Kilbury, and Katina Bontcheva

15.  Building an Ontology-based Multilingual Lexicon for Word Sense
Disambiguation in Machine Translation. Lian-Tze Lim and Tang Enya Kong


The registration fee for the Papillon 2004 workshop is 50 euros.

The registration fee includes:

- Attendance at all sessions

- Coffee and refreshments at official breaks

- Official dinner on Tuesday, August 31st

The registration fee is payable in cash at the registration desk.
Please pre-register for the workshop by sending an e-mail with your
name to papillon2004 at


The Papillon 2004 workshop will take place at the "Maison Jean Kuntzman"
amphitheater of the IMAG institute on the Grenoble university campus
(Site de Saint Martin d'Hères et Gières). Directions to reach the
"Maison Jean Kuntzman" are available at

Miscellaneous Information

- Papillon project Web site:


- IMAG Institute:

- Grenoble tourist information:


For any enquiry, please contact the Papillon 2004 organizers at
papillon2004 at

