14.686, Calls: Grammar Development/Computational Ling

LINGUIST List linguist at linguistlist.org
Mon Mar 10 20:37:30 UTC 2003


LINGUIST List:  Vol-14-686. Mon Mar 10 2003. ISSN: 1068-4875.

Subject: 14.686, Calls: Grammar Development/Computational Ling

Moderators: Anthony Aristar, Wayne State U.<aristar at linguistlist.org>
            Helen Dry, Eastern Michigan U. <hdry at linguistlist.org>

Reviews (reviews at linguistlist.org):
	Simin Karimi, U. of Arizona
	Terence Langendoen, U. of Arizona

Home Page:  http://linguistlist.org/

The LINGUIST List is funded by Eastern Michigan University, Wayne
State University, and donations from subscribers and publishers.

Editor for this issue: Marie Klopfenstein <marie at linguistlist.org>

=================================Directory=================================

1)
Date:  Mon, 10 Mar 2003 11:57:45 +0000
From:  fouvry at coli.uni-sb.de
Subject:  Multilingual Grammar Development

2)
Date:  Fri, 7 Mar 2003 16:54:26 EST
From:  Priscilla Rasmussen <rasmusse at cs.rutgers.edu>
Subject:  ACL 2003

-------------------------------- Message 1 -------------------------------

Date:  Mon, 10 Mar 2003 11:57:45 +0000
From:  fouvry at coli.uni-sb.de
Subject:  Multilingual Grammar Development


Ideas and Strategies for Multilingual Grammar Development

Location: Vienna, Austria
Date: 25-AUG-03 - 29-AUG-03

Call Deadline: 14-Mar-2003

Contact Person: Melanie Siegel
Meeting Email: siegel at dfki.de
Linguistic Subfield(s): Computational Linguistics


This is a session of the following conference:
15th European Summer School in Logic, Language and Information

Meeting Description:

In this workshop at the 2003 European Summer School in Logic,
Language, and Information (ESSLLI2003) in Vienna, participants will
address the issue of building a methodology for parallel grammar
development in linguistically rich frameworks. This methodology should
guide the definitions of common formats, procedures, development
tools, grammar components, and documentation practice, as well as
standardized evaluation methods.
				
Final Call for Papers

ESSLLI Workshop
25 to 29 August 2003

Ideas and Strategies for Multilingual Grammar Development

Taking place during ESSLLI 2003 (18-29 August), Vienna
http://www.logic.at/esslli03/

Workshop Website: http://www.dfki.uni-sb.de/~siegel/esslli/

Topics of the workshop include:

- Methodology for multilingual, broad-coverage, deep grammar
  development within linguistically rich frameworks such as those
  using unification-based grammars. Such approaches may include
  grammar templates, external specifications, or other tactics.

- Guidelines for grammar writers either in the initial stages of
  development or as long-term best practices.

- Organization of a grammar in layered, reusable structures.

- Strategies for adapting existing resources such as taggers or
  morphological analyzers.

The workshop will be held during the second week of ESSLLI2003, 25-29
August 2003, with each of the five sessions allowing for the
presentation of three 20-minute papers followed by discussion.

Submission details:

Abstracts of not more than four pages (with a minimum font size of
11pt and margins of 2.5cm) on any of the above topics are due by
Friday, 14 March 2003 with electronic submission in either PostScript
or PDF format, to multigram at coli.uni-sb.de.

Reviewing will be done anonymously, and the final program will be
determined by the workshop organizers based on these reviews. Authors
will be advised of the results by Monday, 5 May 2003. Full papers to
be included in the workshop proceedings will be due by Saturday, 24
May 2003.


-------------------------------- Message 2 -------------------------------

Date:  Fri, 7 Mar 2003 16:54:26 EST
From:  Priscilla Rasmussen <rasmusse at cs.rutgers.edu>
Subject:  ACL 2003


**************************************************************
ACL2003 News Letter No.3                  (4th of March, 2003)
**************************************************************
Hitoshi Isahara (Publicity Chair, CRL) and Masaki Murata (CRL)
---------------------------------------------------------------

Venue: Convention Center of Sapporo, Sapporo, JAPAN
Dates: Tutorials and Pre-conference Workshops: July 7, 2003
        Main Conference: July 8-10, 2003
        Post-conference Workshops: July 11-12, 2003
(For details, see the Web site http://www.ec-inc.co.jp/ACL2003/)

This news letter includes
1) News from Program Committee of Main Conference
2) Extended Deadline of Student Research Workshop
3) Life Time Achievement Award
4) Abstracts of Tutorials
    4-1) Finite State Language Processing
    4-2) Maximum Entropy Models, Conditional Estimation, and Optimization
         without the Magic
    4-3) Knowledge Discovery from Text
    4-4) Spoken Language Processing: Separating Science Fact from Science
         Fiction
5) Deadlines and Web Sites
    5-1) Student Research Workshop
    5-2) Interactive Poster/Demo Sessions
    5-3) Associated Conferences (EMNLP2003 and IRAL2003)
    5-4) ACL Workshops
    5-5) Exhibits and Sponsorship
6) Important Announcements from Several Associated Conferences and Workshops

---------------------------------------------------------------
1) News from Program Committee of Main Conference

376 papers were submitted to the main conference. This is far more
than we expected. Thank you for your interest in ACL2003.

---------------------------------------------------------------
2) Extended Deadline of Student Research Workshop

The paper submission deadline for the Student Research Workshop has been
extended:

Paper submission deadline: March 15, 2003 (extended)
(Note that we NO LONGER require early registration of papers.)
Web site:   http://tangra.si.umich.edu/clair/acl03-student/

We would appreciate it if you could inform your students that the
deadline has been extended.

---------------------------------------------------------------
3) Life Time Achievement Award

A ceremony for the second Life Time Achievement Award will be held
during ACL 2003. The LTA was established at the 40th anniversary
conference of ACL last year. The first winner of the LTA was
Prof. Aravind Joshi of the University of Pennsylvania.

---------------------------------------------------------------
4) Abstracts of Tutorials

There will be four tutorials, to be given by leading experts in
language and speech processing. The tutorials will take place on July
7. The abstracts of the tutorials and the profiles of the speakers
will be described on the ACL-03 web site. For details, see the Web
site http://www.ec-inc.co.jp/ACL2003/Tutorials.html.

-----
4-1) Finite State Language Processing
Gertjan van Noord (University of Groningen, The Netherlands)

Finite state automata are well-understood, and inherently compact and
efficient models of simple languages. In addition, finite state
automata can be combined in various interesting ways, with the
guarantee that the result again is a finite state automaton.
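
As one illustration of this closure property, the following minimal
Python sketch intersects two deterministic automata using the standard
product construction. The dictionary layout ('start', 'accept', 'delta')
and the helper names are assumptions made for this example only; they
are not the notation of the tutorial or of any toolkit.

```python
def intersect_dfa(a, b):
    """Product construction: the result accepts exactly the strings
    accepted by both a and b, and is itself a finite state automaton.

    Each automaton is a dict (hypothetical layout for this sketch):
      'start'  -> start state
      'accept' -> set of accepting states
      'delta'  -> dict mapping (state, symbol) to the next state
    """
    symbols = {s for (_, s) in a["delta"]} & {s for (_, s) in b["delta"]}
    start = (a["start"], b["start"])
    delta, accept = {}, set()
    seen, todo = {start}, [start]
    while todo:
        state = todo.pop()
        p, q = state
        if p in a["accept"] and q in b["accept"]:
            accept.add(state)
        for s in symbols:
            # Follow a transition only if both automata define it.
            if (p, s) in a["delta"] and (q, s) in b["delta"]:
                nxt = (a["delta"][(p, s)], b["delta"][(q, s)])
                delta[(state, s)] = nxt
                if nxt not in seen:
                    seen.add(nxt)
                    todo.append(nxt)
    return {"start": start, "accept": accept, "delta": delta}


def accepts(dfa, word):
    """Run the automaton on word; a missing transition means rejection."""
    state = dfa["start"]
    for ch in word:
        if (state, ch) not in dfa["delta"]:
            return False
        state = dfa["delta"][(state, ch)]
    return state in dfa["accept"]


# Strings over {a, b} with an even number of 'a's.
even_a = {"start": 0, "accept": {0},
          "delta": {(0, "a"): 1, (1, "a"): 0, (0, "b"): 0, (1, "b"): 1}}
# Strings over {a, b} that end in 'b'.
ends_b = {"start": 0, "accept": {1},
          "delta": {(0, "a"): 0, (0, "b"): 1, (1, "a"): 0, (1, "b"): 1}}

both = intersect_dfa(even_a, ends_b)
```

The combined automaton has at most as many states as the product of the
two inputs' state counts, which is why the result is guaranteed to stay
finite.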

In the introductory part of the tutorial, finite state acceptors and
finite state transducers (both weighted and unweighted) are
introduced, and we briefly review their formal and computational
properties.

In the second part of the tutorial, we illustrate the use of finite
state methods in dictionary construction. In particular, we present an
application of perfect hash automata in tuple dictionaries.  Tuple
dictionaries provide a very compact representation of huge language
models of the kind typically used in NLP applications (including Ngram
language models).

In the third part of the tutorial we focus on regular expressions for
NLP. The type of regular expressions used in modern NLP applications
has evolved dramatically from the regular expressions found in
standard Computer Science textbooks.  In recent years, various high
level regular expression operators have been introduced (such as
contexted replacement operators). The availability of more and more
abstract operators makes the regular expression notation more and more
attractive.  The tutorial provides an introduction to the regular
expression calculus. The examples use the notation of the Fsa
Utilities toolkit: a freely available implementation of the regular
expression calculus. We introduce various regular expression operators
for acceptors and transducers.  We then continue to show how new
regular expression operators can be defined.

In the last part of the tutorial, we focus in more detail on regular
expression operators that turned out to be useful for the description
of certain aspects of phonology using ideas from Optimality
Theory. This part of the tutorial describes the lenient composition
operator of Karttunen, and the optimality operator of Gerdemann and
van Noord, as well as a number of alternatives (Eisner, Jaeger).

-----
4-2) Maximum Entropy Models, Conditional Estimation, and Optimization
without the Magic
Dan Klein and Christopher D. Manning (Stanford University, USA)

This tutorial presents the foundations of maximum entropy models,
optimization methods to learn them, and various issues in the use of
graphical models more complex than simple naive-Bayes (NB) or HMM
models.  The focus is on intuition and understanding, using visual
illustrations and simple examples rather than detailed derivations
whenever possible.

Maximum Entropy Models: What maximum entropy models are, from first
principles, what they can and cannot do, and how they behave.  Lots of
examples.  The equivalence of maxent models and maximum-likelihood
exponential models.  The relationship between maxent models and other
classifiers.  Smoothing methods for maxent models.
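
For readers unfamiliar with the exponential form mentioned here, the
sketch below computes the conditional distribution of such a model,
p(y | x) proportional to exp(sum of the weights of the active features).
The function name, the feature function, and the weights are all
hypothetical and chosen for illustration; this is not material from the
tutorial itself.

```python
import math


def maxent_probs(weights, features, x, labels):
    """Conditional exponential (maxent) model.

    weights  -> dict mapping feature name to its real-valued weight
    features -> function (x, y) returning the active binary feature names
    Returns a dict mapping each label y to p(y | x).
    """
    scores = {y: sum(weights.get(f, 0.0) for f in features(x, y))
              for y in labels}
    # Subtract the maximum score before exponentiating, for stability.
    top = max(scores.values())
    exps = {y: math.exp(s - top) for y, s in scores.items()}
    z = sum(exps.values())  # normalizing constant
    return {y: e / z for y, e in exps.items()}


def toy_features(word, tag):
    """Hypothetical features: the tag alone, and a capitalization cue."""
    feats = ["tag=" + tag]
    if word[:1].isupper():
        feats.append("cap&tag=" + tag)
    return feats


labels = ["NAME", "OTHER"]
uniform = maxent_probs({}, toy_features, "Sapporo", labels)
biased = maxent_probs({"cap&tag=NAME": 2.0}, toy_features, "Sapporo", labels)
```

With no weights the model is uniform over the labels, which is the
maximum entropy distribution when nothing constrains it; a positive
weight on a feature shifts probability toward the labels it fires with.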

Basic Optimization: Unconstrained optimization: convexity, gradient
methods (both simple descent and more practical conjugate methods).
Constrained optimization: Lagrange multipliers and several ways of
turning them into a concrete optimization system.  Other fun things to
do with optimization.  Specialized iterative scaling methods
vs. general optimization.

Model Structures: Conditional independence in graphical models
(focusing on NB, HMMs, and PCFGs).  Practical ramifications of various
independence assumptions.  Label and observation biases in conditional
structures.  Survey of sequence models (HMMs, MEMMs, CRFs, and
dependency networks).

Prerequisites: Familiarity with basic calculus and a working knowledge
of NB and HMMs are required.  Existent but possibly vague knowledge of
general Bayes' nets or basic information theory is a plus.  Most
importantly: a low tolerance for conceptual black boxes labeled "magic
here".

-----
4-3) Knowledge Discovery from Text
Dan Moldovan (University of Texas at Dallas, USA)
Roxana Girju (Baylor University, USA)

Knowledge Discovery is a fast growing area of research and commercial
interest. While knowledge may be discovered from many sources of
information, this tutorial focuses on the discovery of knowledge from
open texts, the largest source of knowledge. The problem of Knowledge
Discovery from Text (KDT) is to extract explicit and implicit concepts
and semantic relations between concepts using Natural Language
Processing techniques. The discovery process is guided by the notion
of context specified either by seed concepts or in some other more
formal way.

KDT, while deeply rooted in NLP, actually draws on methods from
statistics, machine learning, reasoning, information extraction,
knowledge management, cognitive science and others for its discovery
process. The emphasis here is on the automatic discovery of new
concepts and on the large number of semantic relations that link them.
This tutorial presents recent results from KDT research and system
implementations.

Since the goal of KDT is to get insights into large quantities of text
data and bring to bear text semantics, it plays an increasingly
significant role in emerging applications, such as Question Answering,
Summarization, Text Understanding and Ontology Development.

This tutorial is aimed at researchers, practitioners, educators, and
research planners who want to keep in sync with the newly emerging KDT
technology.

-----
4-4) Spoken Language Processing: Separating Science Fact from Science
Fiction
Roger K. Moore (20/20 Speech Ltd, UK)

The advent of talking and listening machines has long been hailed as
"the next big thing" in human-machine interaction.  Indeed only
recently, the IEEE Spectrum magazine (September 2002) named speech as
one of five technologies likely to reap big market rewards in the next
five years.  Certainly, the frequency with which members of the
general public come across speech-enabled applications in their
everyday lives does seem to be on the increase, and the marketplace is
currently able to support a number of sizeable commercial companies
who are supplying speech-based products and services - as well as a
growing academic community of speech scientists and engineers.  This
apparent progress has been fuelled by a number of key developments:
the relentless increase in available computing power, the introduction
of 'data-driven' techniques for speech pattern modelling, and the
institution of public system evaluations.

This tutorial will chart the main advances that have been made in
spoken language processing algorithms and applications over the past
few years.  The key enabling technologies of 'automatic speech
recognition', 'text-to-speech synthesis' and 'spoken language
dialogue' will be explained in some detail, with emphasis being placed
on how the technology works and, perhaps more importantly, why it
sometimes doesn't.  Insight will also be given into the
linguistic/paralinguistic properties of speech signals and human
spoken language, and comparisons will be drawn between the
capabilities of 'automatic' and 'natural' spoken language processing
systems.

The tutorial is aimed at both specialists and non-specialists in the
language processing field, and will be of great interest to anyone who
is keen to develop a greater understanding of the main issues involved
in spoken language processing.  Prof. Moore will cover theoretical and
practical aspects of the inner workings of state-of-the-art spoken
language systems, as well as providing a balanced overview of their
capabilities in relation to other modes of human-machine interaction.

The tutorial will incorporate question-and-answer opportunities, and
will conclude with a survey of open research issues and some
predictions for the future.

---------------------------------------------------------------
5) Deadlines and Web Sites

The student research workshop, the interactive poster/demo sessions,
the associated conferences (EMNLP2003 and IRAL2003) and the workshops
have their own submission deadlines and sites. Please see the web
sites for the details.

-----
5-1) Student Research Workshop

Paper submission deadline: March 15, 2003 (extended)
Web site: http://tangra.si.umich.edu/clair/acl03-student/

-----
5-2) Interactive Poster/Demo Sessions

Paper submission deadline: May 1, 2003
Web site: http://cl.aist-nara.ac.jp/staff/matsu/poster.html

-----
5-3) Associated Conferences (EMNLP2003 and IRAL2003)

AC1 The Eighth Conference on Empirical Methods in Natural Language
Processing (EMNLP2003)
Submission deadline: April 4, 2003
Conference date:     July 11-12, 2003
Web site: http://www.ai.mit.edu/people/mcollins/emnlp03.html

AC2 The Sixth International Workshop on Information Retrieval with
Asian Languages (IRAL2003)
Submission deadline: April 15, 2003
Conference date:     July 7, 2003
Web site: http://research.nii.ac.jp/IRAL2003/

-----
5-4) ACL Workshops

WS1 Multilingual Summarization and Question Answering - Machine
Learning and Beyond
Submission deadline: April 21, 2003
Workshop date:       July 11-12, 2003
Web site: http://www.isi.edu/~cyl/msqa-ml-acl2003/

WS2 Natural Language Processing in Biomedicine
Submission deadline: April 10, 2003
Workshop date:       July 11, 2003
Web site: http://www-tsujii.is.s.u-tokyo.ac.jp/ACL03/bionlp.htm

WS3 The Lexicon and Figurative Language
Submission deadline: April 13, 2003
Workshop date:       July 11, 2003
Web site: http://www.cs.bham.ac.uk/~amw/ACLWorkshop.html

WS4 Multilingual and Mixed-language Named Entity Recognition:
Combining Statistical and Symbolic Models
Submission deadline: April 4, 2003
Workshop date:       July 12, 2003
Web site: http://research.microsoft.com/conferences/mulner-acl03/

WS5 The Second International Workshop on Paraphrasing: Paraphrase
Acquisition and Applications
Submission deadline: April 21, 2003
Workshop date:       July 11, 2003
Web site: http://nlp.nagaokaut.ac.jp/IWP2003/

WS6 Second SIGHAN Workshop on Chinese Language Processing
Deadline: the workshop submission deadline: March 10, 2003
Deadline: the word segmentation bakeoff:    April 22-25, 2003
Workshop date:                              July 11-12, 2003
URL: the workshop: http://www.sighan.org/swclp2/
URL: the bakeoff:  http://www.sighan.org/bakeoff2003/

WS7 Multiword Expressions: Analysis, Acquisition and Treatment
Submission deadline: April 5, 2003
Workshop date:       July 12, 2003
Web site: http://www.cl.cam.ac.uk/users/alk23/mwe/mwe.html

WS8 Linguistic Annotation: Getting the Model Right
Submission deadline: April 5, 2003
Workshop date:       July 11, 2003
Web site: http://www.cs.vassar.edu/~ide/events/ACL2003-LR/

WS9 Workshop on Patent Corpus Processing
Submission deadline: April 10, 2003
Workshop date:       July 12, 2003
Web site: http://www.slis.tsukuba.ac.jp/~fujii/acl2003ws.html

WS10 Towards a Resources Information Infrastructure
Submission deadline: April 13, 2003
Workshop date:       July 11-12, 2003
Web site: http://www.elsnet.org/acl2003-workshop/

-----
5-5) Exhibits and Sponsorship

Application Deadline for both: April 1, 2003
For details, see Exhibits and Sponsorship
at http://www.ec-inc.co.jp/ACL2003/.

---------------------------------------------------------------
6) Important Announcements from Several Associated Conferences and Workshops

-----
AC1 The Eighth Conference on Empirical Methods in Natural Language
Processing (EMNLP2003)

Abstract: SIGDAT, the Association for Computational Linguistics'
special interest group on linguistic data and corpus-based approaches
to NLP, invites submissions to EMNLP 2003.  The conference will be
held on July 11-12 in Sapporo, Japan, immediately following the 41st
meeting of the ACL (ACL 2003).

URL: http://www.ai.mit.edu/people/mcollins/emnlp03
Deadline: 4 April 2003

-----
WS1 Multilingual Summarization and Question Answering - Machine
Learning and Beyond

Abstract: Automatic summarization and question answering (QA) aim at
producing a concise representation of the key information
content. Rule-based or statistical-based approaches to summarization
and QA systems have shown promising results; it is, however, very
difficult to find good evaluation functions or rules that work well
across domains. In consequence, various machine learning (ML)
techniques have recently been applied to summarization and QA systems.
The purpose of this workshop is to provide a forum for exploring the
commonality underlying this diversity of problem domains and
approaches.

Deadline: 21 April 2003

-----
WS2 Natural Language Processing in Biomedicine

Invited speaker: Prof. Carol Friedman, CUNY/Columbia University
'Opportunities and Challenges for NLP in Biomedicine'

The aim of this workshop is to bring together NLP researchers in
biomedicine and to discuss recent advances in the computational analysis
of text, which go beyond traditional keyword-based indexing methods and
begin to offer content-based analysis. Knowledge discovery in the rapidly
growing area of biomedicine is of paramount importance. Processing
biomedical texts is a challenge especially in the areas of terminology,
ontology building, information extraction, annotation tools, sharing and
integration of knowledge from factual and textual data bases and
evaluation of biomedical applications among others. One of the aims of the
workshop is to create SIGs in areas of common interest such as annotation
standards in biology, evaluation metrics, standardisation of
terminological resources, etc.

Submission deadline:  April 10, 2003
Workshop date:        July 11, 2003

Web site: http://www-tsujii.is.s.u-tokyo.ac.jp/ACL03/bionlp.htm

-----
WS3 The Lexicon and Figurative Language

Abstract: The lexicon has variously been treated as a list of word
senses, a list of hierarchically related senses, (e.g. WordNet), and
as a structured entity containing rich lexical representations and
means to generate novel uses of words. Figurative language poses
problems for all these approaches, and a common claim is that metaphor
is a cognitive, not a linguistic, phenomenon; instead, word senses are
related in terms of their underlying conceptual domains. The major
theme of this SIGLEX-endorsed workshop is to explore and attempt to
reconcile these different approaches to figurative language and the
lexicon - although papers exploring other aspects of figurative
language will also be welcome.

Deadline: 13 April 2003
Web site: http://www.cs.bham.ac.uk/~amw/ACLWorkshop.html

-----
WS4 Multilingual and Mixed-language Named Entity Recognition:
Combining Statistical and Symbolic Models

Invited speaker: David Yarowsky

Named Entity (NE) Recognition systems vary widely, from high-speed
bulk methods optimized for indexing, to deep semantic parsers tuned
for specific domains.  Optimal ways to combine statistical and
symbolic models also vary, depending on applications and tasks.  Is it
possible to:

- maximize use of knowledge-rich resources (e.g. lexicons, NE grammars,
  parsing) while permitting corpus-based training for domain or language?
- acquire and share resources (including lexicons and grammars) across
  languages?
- balance performance speed with reasonable accuracy?
- use specific language patterns while permitting rapid transfer to
  another language?
- minimize variability in results across language types?

We welcome research on combined models, in which these tradeoffs are
calculated in particular ways.  Demonstrations of implemented NE
systems are also welcome.

Submit papers by April 4 electronically in Word, PDF or PostScript
format.  Assign a filename based on the paper's title, transfer to
ftp://ftp.research.microsoft.com/incoming/josephp then email an
identification page with title, author(s), contact details, and
filename to molsen at microsoft.com

URL: http://research.microsoft.com/conferences/mulner-acl03/

-----
WS5 Second International Workshop on Paraphrasing: Paraphrase
Acquisition and Applications

Abstract: Paraphrases, variant ways of conveying the same information,
are of interest because they present challenges for many NLP tasks,
such as MT, IR, QA, etc.  This workshop is open to investigation of
all aspects of paraphrase, with a particular focus on the automatic
acquisition of paraphrases from corpora, and on the development of a
standardized paraphrase framework or resource for use in applications.

URL: http://nlp.nagaokaut.ac.jp/IWP2003/
Deadline: 21 April 2003

-----
WS6 Second SIGHAN Workshop on Chinese Language Processing
    (July 11-12, 2003)

Abstract: As more resources for Chinese NLP have recently become
available to the public, it is crucial to set up a platform that allows
easy comparison of different approaches to various NLP tasks. SIGHAN
is conducting a word-segmentation bakeoff before the
workshop. Researchers all over the world are welcome to
participate. As part of this SIGHAN workshop, we will release the
bakeoff results, followed by presentations by bakeoff participants and
a general discussion of future evaluations. The second part of the
workshop will consist of presentations of papers on all aspects of
Chinese language processing.

URL: the workshop: http://www.sighan.org/swclp2/
URL: the bakeoff:  http://www.sighan.org/bakeoff2003/
Deadline: the workshop submission deadline: March 10, 2003
Deadline: the word segmentation bakeoff:    April 22-25, 2003

-----
WS7 Multiword Expressions:  Analysis, Acquisition and Treatment

The workshop will concentrate on the analysis, acquisition and
treatment of multiword expressions (MWEs), such as phrasal verbs
(e.g. "add up"), nominal compounds (e.g. "radar footprint"), and
institutionalized phrases (e.g. "salt and pepper").  In particular we
focus on addressing the problems that MWEs pose for natural language
processing applications.

URL: http://www.cl.cam.ac.uk/users/alk23/mwe/mwe.html
Submission Deadline: 05 April 2003

-----
WS9 Workshop on Patent Corpus Processing

Abstract: The goal of this workshop is to foster research and
development of the technology for patent corpus processing, by
providing a forum in which researchers and practitioners can exchange
and share their ideas, approaches, perspectives, and experiences from
their work in progress. We invite both research papers and project
papers associated with, but not limited to, the rudiments of patent
corpus processing. We also invite papers addressing applications and
user studies.

Deadline: 10 April 2003

============================================================
