13.2021, Calls: Computational Linguistics
LINGUIST List
linguist at linguistlist.org
Sat Aug 3 13:09:56 UTC 2002
LINGUIST List: Vol-13-2021. Sat Aug 3 2002. ISSN: 1068-4875.
Subject: 13.2021, Calls: Computational Linguistics
Moderators: Anthony Aristar, Wayne State U.<aristar at linguistlist.org>
Helen Dry, Eastern Michigan U. <hdry at linguistlist.org>
Reviews (reviews at linguistlist.org):
Simin Karimi, U. of Arizona
Terence Langendoen, U. of Arizona
Consulting Editor:
Andrew Carnie, U. of Arizona <carnie at linguistlist.org>
Editors (linguist at linguistlist.org):
Karen Milligan, WSU
Naomi Ogasawara, EMU
James Yuells, EMU
Marie Klopfenstein, WSU
Michael Appleby, EMU
Heather Taylor, EMU
Ljuba Veselinova, Stockholm U.
Richard John Harvey, EMU
Dina Kapetangianni, EMU
Renee Galvis, WSU
Karolina Owczarzak, EMU
Anita Wang, EMU
Software: John Remmers, E. Michigan U. <remmers at emunix.emich.edu>
Gayathri Sriram, E. Michigan U. <gayatri at linguistlist.org>
Zhenwei Chen, E. Michigan U. <zhenwei at linguistlist.org>
Home Page: http://linguistlist.org/
The LINGUIST List is funded by Eastern Michigan University, Wayne
State University, and donations from subscribers and publishers.
Editor for this issue: Dina Kapetangianni <dina at linguistlist.org>
==========================================================================
As a matter of policy, LINGUIST discourages the use of abbreviations
or acronyms in conference announcements unless they are explained in
the text.
=================================Directory=================================
1)
Date: Fri, 2 Aug 2002 11:31:45 +0100
From: "Roger Harris" <rh at nationalfinder.com>
Subject: EAMT Workshop: last minute submission
2)
Date: Fri, 02 Aug 2002 17:53:20 +0200
From: Marc El-Beze <marc.elbeze at lia.univ-avignon.fr>
Subject: Call for Papers : special issue of TAL on Language Models
-------------------------------- Message 1 -------------------------------
Date: Fri, 2 Aug 2002 11:31:45 +0100
From: "Roger Harris" <rh at nationalfinder.com>
Subject: EAMT Workshop: last minute submission
6th EAMT Workshop: Teaching Machine Translation
Date: 14 - 15 November 2002
Venue: UMIST, Manchester, England
Web-site: http://www.ccl.umist.ac.uk/events/eamt-bcs/cfp.html
-------------------------------------------------------------
The deadline for the submission of extended abstracts passed on
Wednesday, 31 July 2002. You may for some reason have missed
that deadline.
Late submissions will be welcome if received by (and preferably
before) Thursday, 8 August. The Call for Papers is appended below.
With kind regards,
Roger Harris.
--------------------------------------------------------------------------
Call for Papers
The sixth EAMT Workshop will take place on 14-15 November 2002
hosted by the Centre for Computational Linguistics, UMIST,
Manchester, England.
Organised by the European Association for Machine Translation,
in association with the Natural Language Translation Specialist Group
of the British Computer Society, the Workshop will focus on the topic
of:
Teaching Machine Translation
The following topics are of interest:
why and to whom should MT be taught?
teaching the theoretical background of MT: linguistics, computer
science, translation theory
addressing preconceptions about MT in the classroom
the use of commercial MT programs in hands-on teaching
teaching computational aspects of MT to non-computational students
web-based distance learning of MT
MT education and industry: bridging the gap between academia and
the real world
teaching pre- and post-editing skills to MT users
teaching MT evaluation
building modules or `toy' MT systems in the laboratory
experiences of the evaluation of MT instruction
the role of MT in language learning
translation studies and MT
etc.
We invite submissions of an extended abstract of your proposed paper,
up to two pages, summarizing the main points that will be made in
the actual paper.
Submissions will be reviewed by members of the Programme Committee.
Authors of accepted papers will be asked to submit a full version of
the paper, maximum 12 pages, which will be included in the
proceedings.
A stylefile for accepted submissions will be available in due course.
Initially, an extended abstract should be sent, preferably by email
as an attachment in any of the standard formats (doc, html, pdf, ps)
or as plain text, to Harold.Somers at umist.ac.uk.
Otherwise, hardcopy can be sent to:
Harold Somers, Centre for Computational Linguistics, UMIST, PO Box
88, Manchester M60 1QD, England, or by fax to +44 161 200 3091.
Programme Committee
Harold Somers, UMIST, Manchester
Derek Lewis, University of Exeter
Ruslan Mitkov, University of Wolverhampton
Mikel Forcada, Universitat d'Alacant
Karl-Heinz Freigang, Universität des Saarlandes
David Wigg, South Bank University, London
John Hutchins, EAMT
Roger Harris, BCS
Important dates:
Deadline for extended abstract: 31 July 2002: EXPIRED
Acceptance notification: 6 September 2002
Final copies due: 14 October 2002
Conference dates: 14-15 November 2002
-----------------------------------------
-------------------------------- Message 2 -------------------------------
Date: Fri, 02 Aug 2002 17:53:20 +0200
From: Marc El-Beze <marc.elbeze at lia.univ-avignon.fr>
Subject: Call for Papers : special issue of TAL on Language Models
Call for papers (TAL journal): http://www.atala.org/tal/
Automated Learning of Language Models
============================
Deadline for submission : October 7, 2002
Issue coordinated by
Michèle Jardino (CNRS, LIMSI), and
Marc El-Beze (LIA, University of Avignon) .
Language Models (LMs) play a crucial role in Automated Natural
Language Processing systems that deal with real-life, often very
large, problems: Speech Recognition, Machine Translation and
Information Retrieval are typical instances. If we want these
systems to adapt to new applications, or to follow changes in user
behaviour, we need to automate the learning of the parameters of the
models we use. Adaptation may take place in advance or in real
time. Some applications do not allow us to build an adequate corpus,
from either a quantitative or a qualitative point of view. The
richness of Web resources makes the gathering of training data
easier, but within that huge mass we must effectively separate the
wheat from the chaff.
When asked about the optimal size for a learning corpus, are we
satisfied to answer "The bigger, the better"?
Rather than training one LM on a gigantic corpus, would it not be
advisable to split this corpus into linguistically coherent segments
and train several language models, whose scores might be combined at
test time (model mixture)?
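As a minimal sketch of the mixture idea (toy unigram models over
hypothetical sub-corpora; the data, names and weights are purely
illustrative, not from any particular system):

```python
from collections import Counter

def unigram_model(tokens):
    """Estimate a unigram model (word -> relative frequency) from a token list."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def mixture_prob(word, models, weights):
    """Linear interpolation of several models: P(w) = sum_i lambda_i * P_i(w)."""
    return sum(lam * m.get(word, 0.0) for lam, m in zip(weights, models))

# Two "linguistically coherent" sub-corpora (hypothetical toy data)
news = "the market rose the market fell".split()
sport = "the team won the match".split()
models = [unigram_model(news), unigram_model(sport)]
weights = [0.5, 0.5]  # mixture coefficients, to be tuned on held-out data

# Nonzero even though "market" is absent from the sport sub-corpus
p = mixture_prob("market", models, weights)
```

In practice the component models would be n-gram models rather than
unigrams, and the weights would be optimized on held-out data (for
instance by EM).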
In the case of n-gram models, what is the optimal value of n? Should
it be fixed or variable?
A larger value allows us to capture linguistic constraints over a
context which goes beyond the mere two preceding words of the classic
trigram. However, increasing n raises serious coverage problems. What
is the best trade-off between these two opposing constraints?
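The coverage problem can be made concrete in a few lines: as n grows,
the fraction of test n-grams never seen in training rises quickly
(the toy corpora below are purely illustrative):

```python
def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def unseen_rate(train, test, n):
    """Fraction of test n-grams never observed in training: a rough coverage measure."""
    seen = set(ngrams(train, n))
    grams = ngrams(test, n)
    return sum(1 for g in grams if g not in seen) / len(grams) if grams else 0.0

train = "a b a c a b a d a b".split()
test = "a b a d c b".split()
rates = [unseen_rate(train, test, n) for n in (1, 2, 3)]  # rises with n
```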
How can we smooth models in order to approximate phenomena that have
not been learned? Which alternatives should be chosen, and which more
general information (lower-order n-grams, n-classes) should they use?
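One classical answer is to fall back on lower-order statistics when a
higher-order event is unseen. A crude sketch, with an arbitrary
discount factor and no normalization (a real scheme such as Katz
backoff redistributes the probability mass properly):

```python
from collections import Counter

def backoff_bigram(tokens, alpha=0.4):
    """Score(word | prev): the bigram relative frequency when the bigram was
    seen, otherwise alpha times the unigram frequency (lower-order fallback).
    Not normalized; illustrative only."""
    uni = Counter(tokens)
    bi = Counter(zip(tokens, tokens[1:]))
    total = len(tokens)

    def score(prev, word):
        if bi[(prev, word)] > 0:
            return bi[(prev, word)] / uni[prev]
        return alpha * uni[word] / total  # back off to the unigram
    return score

score = backoff_bigram("the cat sat on the mat".split())
seen = score("the", "cat")    # observed bigram: 1/2
unseen = score("cat", "mat")  # never observed: falls back to the unigram
```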
Beyond the traditional opposition between numerical and
knowledge-based approaches, there is a consensus on introducing rules
into stochastic models, or probabilities into grammars, in the hope
of getting the best of both strategies. Hybrid models can be
conceived in several ways, depending on the choices made on each
side and on where the coupling occurs. Because of discrepancies
between the language a grammar generates and the syntagms actually
observed, some researchers have decided to reverse the approach and
derive the grammar from observed facts. So far, however, this method
has yielded disappointing results: it performs no better than n-gram
methods, and is perhaps inferior. Shouldn't a good deal of
supervision be introduced here if we want to reach this goal?
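As a small illustration of the "probabilities into grammars"
direction, maximum-likelihood estimation of rule probabilities from
observed parses fits in a few lines (the mini-treebank and category
names below are hypothetical):

```python
from collections import Counter

# Hypothetical mini-treebank: each parse is a list of (lhs, rhs) rule applications
treebank = [
    [("S", ("NP", "VP")), ("NP", ("det", "noun")), ("VP", ("verb",))],
    [("S", ("NP", "VP")), ("NP", ("noun",)), ("VP", ("verb", "NP")),
     ("NP", ("det", "noun"))],
]

def estimate_pcfg(parses):
    """Maximum-likelihood PCFG: P(A -> beta) = count(A -> beta) / count(A)."""
    rule_counts = Counter()
    lhs_counts = Counter()
    for parse in parses:
        for lhs, rhs in parse:
            rule_counts[(lhs, rhs)] += 1
            lhs_counts[lhs] += 1
    return {rule: c / lhs_counts[rule[0]] for rule, c in rule_counts.items()}

pcfg = estimate_pcfg(treebank)  # e.g. P(NP -> det noun) = 2/3
```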
Topics (non-exhaustive list):
====================
In this special issue, we would like to publish innovative papers, as
well as surveys and prospective essays, dealing with Language Models
(LMs) and the automated learning of their parameters, and covering
one of the following subtopics:
* Language Models and Resources:
- determination of the adequate lexicon
- determination of the adequate corpus
* Topical Models
* LM with fixed or variable history
* Probabilistic Grammars
* Grammatical Inference
* Hybrid Language Models
* Static and dynamic adaptation of LMs
* Dealing with the Unknown
- Modelling words which do not belong to the vocabulary
- Methods for smoothing LMs
* Supervised and unsupervised learning of LMs
- Automated classification of basic units
- Introducing linguistic knowledge into LMs
* Methods for LM learning
- EM, MMI, others?
* Evaluation of Language Models
* Complexity and LM theory
* Applications:
- Speech Recognition
- Machine Translation
- Information Retrieval
Format:
=====
Papers (25 pages maximum) are to be submitted in Word or LaTeX.
Style sheets are available at HERMES: http://www.hermes-science.com/
Language:
=======
Articles may be written in either French or English; however, English
will be accepted only from non-French-speaking authors.
Deadlines:
=======
Submission deadline is October 7, 2002. Authors who plan to submit a
paper are invited to contact
Michèle Jardino and / or Marc El-Beze ( mailto:tal.ml at limsi.fr ) before
September 15, 2002.
Articles will be reviewed by a member of the editorial board and by
two external reviewers designated by the editors of this issue.
Decisions of the editorial board and the referees' reports will be
transmitted to the authors before November 20, 2002.
The final versions of accepted papers will be required by February
20, 2003. Publication is planned for spring 2003.
Submission:
========
Submissions must be sent electronically to:
Michèle Jardino ( mailto:jardino at limsi.fr )
Marc El-Bèze ( mailto:marc.elbeze at lia.univ-avignon.fr )
or, in paper version (four copies), posted to:
Marc El-Beze
Laboratoire d'Informatique
LIA - CERI
BP 1228
84911 AVIGNON CEDEX 9
FRANCE
---------------------------------------------------------------------------
LINGUIST List: Vol-13-2021