Call: Finite-State Methods and Natural Language Processing 2008

Thierry Hamon thierry.hamon at LIPN.UNIV-PARIS13.FR
Tue Mar 18 08:44:37 UTC 2008

Date: Fri, 14 Mar 2008 22:07:57 +0100
From: Jakub.Piskorski at


         Finite-State Methods and Natural Language Processing
                             FSMNLP 2008
                    Seventh International Workshop

                      SECOND  CALL  FOR  PAPERS
                  11-12 September 2008, Ispra, Italy
                contact: fsmnlp2008 [ad] jrc [dot] it


This year FSMNLP is merged with the FASTAR 
(Finite Automata Systems - Theoretical and Applied Research) 
workshop (


The aim of FSMNLP 2008 is to bring together members of the
research and industrial communities working on finite-state based models
in language technology, computational linguistics, web mining,
linguistics, and cognitive science, or on related theory and methods in
fields such as computer science and mathematics.  The workshop will be
a forum for researchers and practitioners working

  * on NLP applications,
  * on the theoretical and implementation aspects, or
  * on their combination.

The special theme of FSMNLP 2008 centers around high performance
finite-state devices in large-scale natural language text processing
systems and applications. We particularly invite novel, high-quality
papers on topics including:

  * practices and experience in deployment of finite-state techniques
    in real-world applications processing massive amounts of natural
    language data

  * industrial-strength finite-state pattern engines for information
    retrieval, information extraction and related text-mining tasks

  * scalability issues in FS-based large-scale text processing systems

  * efficient finite-state methods in search engines

  * implementation, construction, compression and processing
    techniques for huge finite-state devices and networks

  * novel application- and efficiency-oriented finite-state paradigms
    (compilation and processing), e.g., finite-state devices with rich
    label annotations, unification-based finite-state devices

  * comparative studies of time- and space-efficient finite-state
    methods (vs. other techniques) utilized in NLP applications

  * novel application areas for finite-state devices in text
    processing and information management systems

  * design patterns for implementing finite-state devices and toolkits

We also invite submissions that are related to the traditional FSMNLP
themes including but not limited to:

1. NLP applications and linguistic aspects of finite-state methods

The topic includes but is not restricted to:

  * speech, sign language, phonology, hyphenation, prosody
  * scripts, text normalization, segmentation, tokenization, indexing
  * morphology, stemming, lemmatisation, information retrieval, web
    mining, spelling correction
  * syntax, POS tagging, partial parsing, disambiguation, information
    extraction, question answering
  * machine translation, translation memories, glossing, dialect
  * annotated corpora and treebanks, semi-automatic annotation, error
    mining, searching

2. Finite-state models of language

Within this more focused topic (a subset of topic 1), we invite papers
on aspects that motivate the sufficiency of finite-state methods or
their subsets for capturing various requirements of natural language
processing. The topic includes but is not restricted to:

  * performance, linguistic applicability, finite-state hypotheses
  * Zipf's law and coverage, model checking against finite corpora
  * regular approximations under parameterized complexity, limitations
    and definitions of relevant complexities such as ambiguity,
    recursion, crossings, rule applications, constraint violations,
    reduplication, exponents, discontinuity, path-width, and induction
  * similarity inferences, dissimilation, segmental length,
    counter-freeness, asynchronous machines
  * garden-path sentences, deterministic parsing, expected parses,
    Markov chains
  * incremental parsing, uncertainty, reliability/variance in
    stochastic parsing, linear sequential machines

3. Practices for building lexical transducers for the world's

This topic addresses the usability of finite-state methods in NLP. It
includes but is not restricted to:

  * required user training and consultation, learning curve of
  * questionnaires, discovery methods, adaptive computer-aided
    glossing and interlinearization
  * example-based grammars, unsupervised learning, semi-automatic
    learning, user-driven learning (see topic 5 too)
  * low literacy level and restricted availability of training data,
    writing systems/phonology under development, new non-Roman
    scripts, endangered languages
  * linguist's workbenches, stealth-to-wealth parser development
  * experiences of using existing tools (e.g. TWOL) for computational
    morphology and phonology

4. Specification and implementation of sets, relations and
multiplicities in NLP using finite-state devices

The topic includes but is not restricted to:

  * regular rule formalisms, grammar systems, expressions, operations,
    closure properties, complexities
  * algorithms for compilation, approximation, manipulation,
    optimization, and lazy evaluation of finite machines
  * finite string and tree automata, transducers, morphisms and
  * weights, registers, multiple tapes, alphabets, state covers and
    partitions, representations
  * locality, constraint propagation, star-free languages, data
    vs. query complexity
  * logical specification, MSO(SLR,matches), FO(Str,<), LTL,
    generalized restriction, local grammars
  * multi-tape automata, same-length relations and partition-based
    morphology, Semitic morphology
  * autosegmental phonology, shuffle, trajectories, synchronization,
    segmental anchoring, alignment constraints, syllable structure,
    partial-order reductions
  * varieties of regular languages and relations, descriptive
    complexity of finite-state based grammars
  * automaton-based approaches to declarative constraint grammars,
    constraints in optimality theory
  * parallel corpus annotations, register automata, acyclic timed

5. Machine learning of finite-state models of natural language

This topic includes but is not restricted to:

  * learning regular rule systems, learning topologies of finite
    automata and transducers
  * parameter estimation and smoothing, lexical openness
  * computer-driven grammar writing, user-driven grammar learning,
    discovery procedures
  * data scarcity, realistic variations of Gold's model, learnability
    and cognitive science
  * incompletely specified finite-state networks
  * model-theoretic grammars, gradient well/ill-formedness

6. Finite-state manipulation software (with relevance to the above themes)

This topic includes but is not restricted to:

  * regular expression pre-compilers such as regexopt, xfst2fsa,
    standards and interfaces for finite-state based software
    components, conversion tools
  * tools such as LEXC, Lextools, Intex, XFST, FSM, GRM, WFSC, FIRE
    Engine, FADD, FSA/UTR, SRILM, FIRE Station and Grail
  * free or almost free software such as MIT FST, Carmel, RWTH FSA,
    FSA Utilities, FSM<2.0>, Unitex, OpenFIRE, OpenFST, Vaucanson,
    (see for
    more examples)
  * results obtainable with such exploration tools as automata,
    Autographe, Amore, and TESTAS
  * visualization tools such as Graphviz and Vaucanson-G
  * language-specific resources and descriptions, freely available
    benchmarking resources

The descriptions of the topics above are not meant to be complete, and
should be read as covering all traditional FSMNLP topics. Submitted
papers or abstracts may fall into several categories.


We expect three kinds of submissions: 

- full papers, 
- short papers, and
- interactive software demos. 

Submissions must be made electronically, in PDF format, via a web-based
submission server.  Authors are encouraged to use the Springer LNCS style
(Proceedings and Other Multiauthor Volumes) for LaTeX in producing the
PDF document. More information on this style can be found at:

The page limit for full papers is 12 pages, whereas short papers and
software demo descriptions are limited to 6 pages. Author information
should be omitted from submitted papers, since the review process will
be blind.  More detailed information about submission is available at:


The papers and abstracts will be published in the FSMNLP 2008
proceedings (paper version).  We are currently negotiating publication
of the post-proceedings with a scientific publisher.  Publication of
extended and revised versions of the papers in a special journal issue
is also planned.


Paper submissions due:  11 May 
Notification of acceptance:  11 June
Camera-ready versions due: 30 June 


Cyril Allauzen (Google Research, New York, USA) 
Francisco Casacuberta (Instituto Tecnológico de Informática, Valencia, Spain)
Jean-Marc Champarnaud (Université de Rouen, France) 
Maxime Crochemore (Department of Computer Science, King's College London, U.K.) 
Jan Daciuk (Gdańsk University of Technology, Poland) 
Karin Haenelt (Fraunhofer Gesellschaft and University of Heidelberg, Germany) 
Thomas Hanneforth (University of Potsdam, Germany) 
Colin de la Higuera (Jean Monnet University, Saint-Etienne, France) 
André Kempe (Yahoo Search Technologies, Paris, France) 
Derrick Kourie (Dept. of Computer Science, University of Pretoria, South Africa) 
Andras Kornai (Budapest Institute of Technology, Hungary and MetaCarta, Cambridge, USA) 
Marcus Kracht (University of California, Los Angeles, USA)
Hans-Ulrich Krieger (DFKI GmbH, Saarbrücken, Germany) 
Eric Laporte (Université de Marne-la-Vallée, France) 
Stoyan Mihov (Bulgarian Academy of Sciences, Sofia, Bulgaria) 
Hermann Ney (RWTH Aachen University, Germany)
Kemal Oflazer (Sabanci University, Turkey and Carnegie Mellon University, Pittsburgh, USA) 
Jakub Piskorski (Joint Research Center of the European Commission, Italy) 
Michael Riley (Google Research, New York, USA) 
Strahil Ristov (Ruder Boskovic Institute, Zagreb, Croatia) 
Wojciech Rytter (Warsaw University, Poland) 
Jacques Sakarovitch (Ecole nationale supérieure des Télécommunications, Paris, France) 
Max Silberztein (Université de Franche-Comté, France) 
Wojciech Skut (Google Research, Mountain View, USA) 
Bruce Watson (Dept. of Computer Science, University of Pretoria, South Africa) 
Shuly Wintner (University of Haifa, Israel) 
Atro Voutilainen (Connexor Oy, Finland) 
Anssi Yli-Jyrä (University of Helsinki and CSC – Scientific Computing Ltd., Espoo, Finland)
Sheng Yu (University of Western Ontario, Canada) 
Lynette van Zijl (Stellenbosch University, South Africa) 

Message distributed by the Langage Naturel list <LN at>
Information, subscription:
English version       : 
Archives                 :

The LN list is sponsored by ATALA (Association pour le Traitement
Automatique des Langues)
Information and membership:
