21.1661, Calls: Computational Ling/Sweden

linguist at LINGUISTLIST.ORG
Tue Apr 6 15:49:01 UTC 2010


LINGUIST List: Vol-21-1661. Tue Apr 06 2010. ISSN: 1068 - 4875.

Subject: 21.1661, Calls: Computational Ling/Sweden

Editor for this issue: Kate Wu <kate at linguistlist.org>
================================================================  


===========================Directory==============================  

1)
Date: 05-Apr-2010
From: Barbara Plank < b.plank at rug.nl >
Subject: ACL 2010 Workshop on Domain Adaptation for Natural Language Processing
 

	
-------------------------Message 1 ---------------------------------- 
Date: Tue, 06 Apr 2010 11:47:10
From: Barbara Plank [b.plank at rug.nl]
Subject: ACL 2010 Workshop on Domain Adaptation for Natural Language Processing


Full Title: ACL 2010 Workshop on Domain Adaptation for Natural Language Processing 
Short Title: DANLP 

Date: 15-Jul-2010 - 15-Jul-2010
Location: Uppsala, Sweden 
Contact Person: Barbara Plank
Meeting Email: b.plank at rug.nl
Web Site: http://sites.google.com/site/danlp2010/ 

Linguistic Field(s): Computational Linguistics 

Call Deadline: 11-Apr-2010 

Meeting Description:

ACL 2010 Workshop on Domain Adaptation for Natural Language Processing (DANLP 2010)
http://sites.google.com/site/danlp2010/
July 15, 2010, Uppsala, Sweden 

Call for Papers

Deadline Extended to April 11, 2010, 23:59 CET

Most modern Natural Language Processing (NLP) systems are subject to the
well-known problem of lack of portability to new domains and genres: their
performance drops substantially when they are tested on data from a new
domain, i.e., when the test data is drawn from a related but different
distribution than the training data. This problem is inherent in the
assumption of independent and identically distributed (i.i.d.) variables
underlying most machine learning systems, but it has begun to receive
attention only in recent years. The need for domain adaptation arises in
almost all NLP tasks: part-of-speech tagging, semantic role labeling,
statistical parsing and statistical machine translation, to name but a few.
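As a toy illustration of this performance drop (a sketch on invented data,
not part of the call itself), the following Python snippet trains a
bag-of-words classifier on one review domain and evaluates it on another:

    # Hypothetical sketch of the domain-shift problem described above;
    # the tiny inline corpora are invented purely for illustration.
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    # Source domain: book reviews; target domain: kitchen-appliance reviews.
    source_texts = ["a gripping plot", "dull characters",
                    "brilliant prose", "boring story"]
    source_labels = [1, 0, 1, 0]
    target_texts = ["blade stays sharp", "handle broke quickly",
                    "heats evenly", "leaks badly"]
    target_labels = [1, 0, 1, 0]

    model = make_pipeline(CountVectorizer(), LogisticRegression())
    model.fit(source_texts, source_labels)

    # In-domain accuracy is high; target accuracy collapses because the
    # target vocabulary scarcely overlaps with the training distribution.
    print("source accuracy:", model.score(source_texts, source_labels))
    print("target accuracy:", model.score(target_texts, target_labels))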

Studies on supervised domain adaptation (where there are limited amounts of
annotated resources in the new domain) have shown that baselines consisting
of very simple models (e.g., models based only on source-domain data, only
on target-domain data, or on the union of the two) achieve relatively high
performance and are "surprisingly difficult to beat" (Daumé III, 2007). One
conclusion from that line of work is that as long as there is a reasonable
(often even small) amount of labeled target data, it is often most fruitful
to just use it.
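A minimal sketch of these baselines on synthetic data follows; the feature
augmentation of Daumé III (2007) is included for comparison, and all data,
dimensions and settings here are assumptions made purely for illustration:

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)

    def make_domain(n, shift):
        # Two Gaussian classes; `shift` moves the feature means between domains.
        X = rng.normal(shift, 1.0, (n, 20))
        y = rng.integers(0, 2, n)
        X[y == 1] += 0.8
        return X, y

    X_src, y_src = make_domain(500, 0.0)   # plentiful source-domain labels
    X_tr, y_tr = make_domain(50, 1.5)      # small labeled target training set
    X_te, y_te = make_domain(200, 1.5)     # held-out target test set

    def augment(X, domain):
        # Daume-style augmentation: <general, source-specific, target-specific>.
        zeros = np.zeros_like(X)
        return np.hstack([X, X, zeros]) if domain == "src" else np.hstack([X, zeros, X])

    runs = {
        "source-only": (X_src, y_src, X_te),
        "target-only": (X_tr, y_tr, X_te),
        "union": (np.vstack([X_src, X_tr]),
                  np.concatenate([y_src, y_tr]), X_te),
        "augmented": (np.vstack([augment(X_src, "src"), augment(X_tr, "tgt")]),
                      np.concatenate([y_src, y_tr]), augment(X_te, "tgt")),
    }
    for name, (X, y, X_eval) in runs.items():
        clf = LogisticRegression(max_iter=1000).fit(X, y)
        print(f"{name:12s} target accuracy: {clf.score(X_eval, y_te):.2f}")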

In contrast, semi-supervised adaptation (i.e., with no annotated resources
in the new domain) is a much more realistic setting, but it is clearly also
considerably more difficult. Current studies on semi-supervised approaches
show very mixed results. For example, Structural Correspondence Learning
(Blitzer et al., 2006) was applied successfully to classification tasks,
while only modest gains could be obtained for structured output tasks such
as parsing. Many questions thus remain open.
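For readers unfamiliar with the cited method, here is a condensed sketch of
the shape of Structural Correspondence Learning. Real SCL selects pivot
features that are frequent in both domains and fits them with a modified
Huber loss; this toy version assumes random stand-in data and a
least-squares shortcut:

    import numpy as np

    rng = np.random.default_rng(0)
    X_u = rng.integers(0, 2, (1000, 300)).astype(float)  # pooled unlabeled data

    pivots = np.arange(20)     # indices of shared "pivot" features (assumed)
    rest = np.arange(20, 300)  # non-pivot features

    # 1. For each pivot, learn a linear predictor from the non-pivot features.
    W = np.column_stack([
        np.linalg.lstsq(X_u[:, rest], X_u[:, p], rcond=None)[0] for p in pivots
    ])

    # 2. An SVD of the predictor matrix yields a shared low-dimensional
    #    projection intended to align feature correspondences across domains.
    U, _, _ = np.linalg.svd(W, full_matrices=False)
    theta = U[:, :10]          # top singular directions

    # 3. Augment any labeled example with its projected representation.
    def scl_features(x):
        return np.concatenate([x, x[rest] @ theta])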

The goal of this workshop is to provide a meeting point for research that
approaches the problem of adaptation from the varied perspectives of
machine learning and a variety of NLP tasks such as parsing, machine
translation, word sense disambiguation, etc. We believe there is much to
gain by treating domain adaptation as a general learning strategy that
uses prior knowledge of a specific or a general domain in learning about a
new domain; here the notion of a 'domain' could be as varied as child
language versus adult language, or the source-side reordering of words into
target-side word order in a statistical machine translation system.

Sharing insights, methodologies and successes across tasks will thus
contribute to a better understanding of this problem. For instance,
self-training the Charniak parser alone was not effective for adaptation
(common wisdom held that self-training is generally not effective), but
self-training with a reranker was surprisingly effective (McClosky et al.,
2006). Is this an insight into adaptation that can be used elsewhere? We
believe that the key to future success will be to exploit large collections
of unlabeled data in addition to labeled data, not only because unlabeled
data is easier to obtain, but also because existing labeled resources are
often not even close to the envisioned target application domain. Directly
related is the question of how to measure closeness (or difference) among
domains.
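As a concrete reference point for the self-training discussion above, a
generic confidence-thresholded self-training loop looks as follows; the
data, labeling rule and 0.9 threshold are assumptions for illustration,
not the McClosky et al. parser-plus-reranker recipe:

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    X_lab = rng.normal(0.0, 1.0, (200, 10))
    y_lab = (X_lab[:, 0] > 0).astype(int)        # toy labeling rule
    X_unlab = rng.normal(0.5, 1.0, (500, 10))    # unlabeled target-domain data

    X_train, y_train = X_lab, y_lab
    for _ in range(5):
        clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
        proba = clf.predict_proba(X_unlab)
        confident = proba.max(axis=1) > 0.9      # keep only confident labels
        if not confident.any():
            break
        # Move confidently self-labeled examples into the training set.
        X_train = np.vstack([X_train, X_unlab[confident]])
        y_train = np.concatenate([y_train, proba.argmax(axis=1)[confident]])
        X_unlab = X_unlab[~confident]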

Workshop Topics
We especially encourage submissions on semi-supervised approaches to
domain adaptation with a deep analysis of models, data and results,
although we do not exclude papers on supervised adaptation. In particular,
we welcome submissions that address any of the following topics or other
relevant issues:

-  Algorithms for semi-supervised DA
-  Active learning for DA
-  Integration of expert/prior knowledge about new domains
-  DA in specific applications (e.g., Parsing, MT, IE, QA, IR, WSD)
-  Automatic domain identification and model adjustment (a toy
domain-distance sketch follows this list)
-  Porting algorithms developed for one type of problem structure to another
(e.g. from binary classification to structured-prediction problems)
-  Analysis and negative results: in-depth analysis of which model
parts/parameters are responsible for successful adaptation, and what can be
learned from negative experimental results about learning
strategies/parameters
-  A complementary perspective: (better) generalization of ML models, i.e.,
making NLP models more broad-coverage and domain-independent rather than
domain-specific
-  Learning from multiple domains
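On automatic domain identification, and on the question of measuring domain
difference raised above, one common proxy (assumed here rather than
prescribed by this call) is to train a classifier to tell two domains
apart: near-chance accuracy means the domains look alike, while high
accuracy signals a large domain gap (cf. the proxy A-distance of Ben-David
et al., 2007):

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    def domain_distance(X_a, X_b):
        # Accuracy of a domain discriminator, rescaled to [0, 1]:
        # 0 = indistinguishable domains, 1 = fully separable domains.
        X = np.vstack([X_a, X_b])
        d = np.concatenate([np.zeros(len(X_a)), np.ones(len(X_b))]).astype(int)
        acc = cross_val_score(LogisticRegression(max_iter=1000), X, d, cv=5).mean()
        return 2.0 * (acc - 0.5)

    rng = np.random.default_rng(0)
    near = domain_distance(rng.normal(0, 1, (300, 10)), rng.normal(0.1, 1, (300, 10)))
    far = domain_distance(rng.normal(0, 1, (300, 10)), rng.normal(2.0, 1, (300, 10)))
    print(f"similar domains: {near:.2f}, distant domains: {far:.2f}")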

Submission
Papers should be submitted via the ACL submission system:
https://www.softconf.com/acl2010/DANLP/

All submissions are limited to 6 pages (including references) and
should be formatted using the ACL 2010 style file that can be found at:
http://acl2010.org/authors.html.

As the reviewing will be blind, papers must not include the authors'
names and affiliations.  Submissions should be in English and should
not have been published previously. If essentially identical papers
are submitted to other conferences or workshops as well, this fact
must be indicated at submission time.

The extended submission deadline is 23:59 CET on April 11, 2010 (Sunday).

Important Dates
April 11, 2010: Submission deadline
May 11, 2010: Notification of acceptance
May 21, 2010: Camera-ready papers due
July 15, 2010: Workshop

Invited speaker
John Blitzer, University of California, United States

Organization
Hal Daumé III, University of Utah, USA
Tejaswini Deoskar, University of Amsterdam, The Netherlands
David McClosky, Stanford University, USA
Barbara Plank, University of Groningen, The Netherlands
Jörg Tiedemann, Uppsala University, Sweden

Program Committee
Eneko Agirre, University of the Basque Country, Spain
John Blitzer, University of California, United States
Walter Daelemans, University of Antwerp, Belgium
Mark Dredze, Johns Hopkins University, United States
Kevin Duh, NTT Communication Science Laboratories, Japan (formerly University 
of Washington, Seattle)
Philipp Koehn, University of Edinburgh, United Kingdom
Jing Jiang, Singapore Management University, Singapore
Oier Lopez de Lacalle, University of the Basque Country, Spain
Robert Malouf, San Diego State University, United States
Ray Mooney, University of Texas, United States
Hwee Tou Ng, National University of Singapore, Singapore
Khalil Sima'an, University of Amsterdam, The Netherlands
Michel Simard, National Research Council of Canada, Canada
Jun'ichi Tsujii, University of Tokyo, Japan
Antal van den Bosch, Tilburg University, The Netherlands
Josef van Genabith, Dublin City University, Ireland
Yi Zhang, German Research Centre for Artificial Intelligence (DFKI GmbH) and
Saarland University, Germany

Sponsor
This workshop is kindly supported by the Stevin project PaCo-MT (Parse
and Corpus-based Machine Translation).

Contact
Email: danlp.acl2010 at gmail.com
Website: http://sites.google.com/site/danlp2010/





LINGUIST List: Vol-21-1661