[Corpora-List] Second CfP : ICML/UAI/COLT 2008 Workshop on Prior Knowledge for Text and Language Processing
Hal Daume III
hal at cs.utah.edu
Mon Apr 28 15:31:55 UTC 2008
WORKSHOP: PRIOR KNOWLEDGE FOR TEXT AND LANGUAGE PROCESSING
9 July 2008, Helsinki, in conjunction with ICML/UAI/COLT 2008
CALL FOR PAPERS: *** NOTE THE EXTENDED DEADLINE ***
Abstract submission deadline: 7 May 2008 (extended from 30 April)
Notification to authors: 22 May 2008 (extended from 15 May)
Final version: 30 June 2008
Workshop: 9 July 2008
Web page: http://prior-knowledge-language-ws.wikidot.com (please monitor
this page for updates)
CONTEXT: The workshop is part of the Thematic Programme "Leveraging
Complex Prior Knowledge for Learning" of the PASCAL-2 European Network
of Excellence starting in March 2008.
GOALS: The aim of the workshop is to present and discuss recent advances
in machine learning approaches to text and natural language processing
that capitalize on rich prior knowledge models in these domains.
MOTIVATION: Traditionally, in Machine Learning, a strong focus has been
put on data-driven methods that assume little a priori knowledge on the
part of the learning mechanism. Such techniques have proven quite
effective not only for simple pattern recognition tasks, but also, more
surprisingly, for such tasks as language modeling in speech recognition
using basic n-gram models. However, when the structures to be learned
become more complex, even large training sets become sparse relative to
the task, and this sparsity can only be mitigated if some prior
knowledge comes into play to constrain the space of fitted models. We
currently see a strong emerging trend in the field of machine learning
for text and language processing to incorporate such prior knowledge, for
instance in language modeling (e.g. through non-parametric Bayesian
priors) or in document modeling (e.g. through hierarchical graphical
models). There are complementary attempts in the field of statistical
computational linguistics (e.g. in statistical machine translation) to
build hybrid systems that do not rely uniquely on corpus data, but also
exploit some form of a priori grammatical knowledge, bridging the gap
between purely data-oriented approaches and the traditional purely
rule-based approaches, which do not rely on automatic corpus training,
but only indirectly on human observations about linguistic data. The
domain of text and language processing thus appears as a very promising
field for studying the interactions between prior knowledge and raw
training data, and this workshop aims at providing a forum for
discussing recent theoretical and practical advances in this area.
TOPICS: The workshop aims at presenting a diversity of viewpoints on
prior knowledge for language and text processing. Discussion of the
following topics, techniques and issues is encouraged (non-exhaustive list):
* Prior knowledge for language modeling, parsing, translation
* Topic modeling for document analysis and retrieval
* Parametric and non-parametric Bayesian models in NLP
* Graphical models embodying structural knowledge of texts
* Complex features/kernels that incorporate linguistic knowledge;
kernels built from generative models
* Limitations of purely data-driven learning techniques for text
and language applications; performance gains due to incorporation of
prior knowledge
* Typology of different forms of prior knowledge for NLP (knowledge
embodied in generative Bayesian models, in MDL models, in ILP/logical
models, in linguistic features, in representational frameworks, in
grammatical rules…)
* Formal principles for combining rule-based and data-based
approaches to NLP
* Linguistic science and cognitive models as sources of prior knowledge
FORMAT: The workshop will consist of a mix of submitted papers, invited
talks, and discussion/panels in which different viewpoints will be
emphasized.
CALL FOR PAPERS: Researchers interested in presenting their work at the
workshop should send an email (preferably plain text or PDF attachment)
to ws_pktlp at xrce.xerox.com with the following information:
TITLE
AUTHORS
ABSTRACT (corresponding to approximately two plain text pages)
Note: If you experience problems with the above email alias, please
contact: marc (dot) dymetman (at) xrce (dot) xerox (dot) com
We expect speakers to provide a final version of their paper before the end
of June for inclusion on the workshop home page, and authors will be
encouraged to read the included papers prior to the meeting. A compiled
set of papers will be distributed as working notes at the workshop.
INVITED PRESENTATIONS AND PANELISTS (partial list):
* David Blei
* Pedro Domingos
* Mark Johnson
* Dan Melamed
* Massimiliano Pontil
PROGRAM COMMITTEE
* Guillaume Bouchard
* Nicola Cancedda
* Hal Daumé III
* Marc Dymetman
* Tom Griffiths
* Peter Grünwald
* Kevin Knight
* Mark Johnson
* Yee Whye Teh
ORGANIZERS:
* Guillaume Bouchard: guillaume (dot) bouchard (at) xrce (dot)
xerox (dot) com
* Hal Daumé III: hal (at) cs (dot) utah (dot) edu
* Marc Dymetman (main contact): marc (dot) dymetman (at) xrce (dot)
xerox (dot) com
* Yee Whye Teh: yeewhye (at) gmail (dot) com
--
Hal Daume III --- me AT hal3 DOT name | http://www DOT hal3 DOT name
"Arrest this man, he talks in maths." | http://nlpers.blogspot.com