Call: AND 2011

Thierry Hamon thierry.hamon at UNIV-PARIS13.FR
Wed Jun 8 20:53:40 UTC 2011


Date: Wed, 8 Jun 2011 09:16:44 -0400 (EDT)
From: Sebastián Peña Saldarriaga <spena at synchromedia.ca>
Message-ID: <19554746.15211307539004246.JavaMail.root at webmail.synchromedia.ca>
X-url: http://and2011.cse.lehigh.edu/


========================================================================
AND 2011
Fifth Workshop on
Analytics for Noisy Unstructured Text Data
In conjunction with
the 11th International Conference on Document Analysis and Recognition
(ICDAR)
September 17th, 2011
Beijing, China
http://and2011.cse.lehigh.edu/
========================================================================

Noisy unstructured text data is ubiquitous in real-world
communication. Natural language and the creative ways that humans use
it can create problems for computational techniques. Electronic text
from the Internet (emails, message boards, newsgroups, blogs, wikis,
chatlogs and web pages), contact centers (complaints, emails, call
transcriptions, message summaries), and mobile phones (SMS) is often
noisy: it contains spelling errors, abbreviations, non-standard words,
false starts, repetitions, missing punctuation, missing case
information and special characters.

Informal communication is not the only source of noisy text. Text
produced by processing signals intended for human use, such as
printed/handwritten documents, spontaneous speech, and camera-captured
scene images, is another prime example. Recognition errors made by
Optical Character Recognition (OCR) and Automatic Speech Recognition
(ASR) systems can result in imperfect transcriptions. The ongoing
mass digitization of the world's written cultural heritage produces an
increasing stream of imperfect OCR results. Such noise in text has
raised new challenges for the tasks of Information Retrieval and
Knowledge Management.

Handling noisy text poses new challenges for Information Extraction
(IE), Natural Language Processing (NLP), Information Retrieval (IR)
and Knowledge Management (KM). Special handling of noise as well as
noise-robust IR and KM techniques are essential to overcome these
challenges. As with AND 07, 08, 09, and 10, we intend AND 2011 to
give researchers an opportunity to present their latest results on
addressing these challenges. We seek papers
dealing with all aspects of noisy unstructured text data, its
processing and applications. We particularly encourage contributions
that look toward solving real life problems.

We welcome original research papers that identify key problems related
to noisy text analytics and offer solutions. Potential topics include
(but are not limited to):

- Noise induced by document analysis techniques and its impact on
  downstream applications
- Formal theory on characterization of noise
- Genre recognition based on the type of noise
- Robust parsing and Part of Speech (POS) tagging
- Characterizing, modelling and accounting for historical language
  change
- Methods for detecting and correcting errors in noisy text
- Information extraction and retrieval from noisy text data
- Automatic classification and clustering of noisy unstructured data
- Noise-invariant document summarization techniques
- Issues in keyword search in the presence of noise in unstructured data
- Machine Translation for noisy text
- Analyzing very short communications like those on Twitter
- Techniques for analysis and mining of call-logs, transcribed calls,
  web logs, chat logs, emails, tweets
- Business Intelligence (BI) applications dealing with noisy text data
- Surveys relating to noisy text analytics



Important Dates
--------------------
Abstract Submission: Extended to June 19th, 2011
Paper Submission: Extended to June 26th, 2011
Notification of Acceptance: July 25th, 2011
Camera-Ready papers due: August 8th, 2011


Organization
---------------------
Daniel Lopresti (Lehigh University)
Christoph Ringlstetter (University of Munich)
Shourya Roy (Xerox, India)
Lipika Dey (TCS India)

-------------------------------------------------------------------------
Message distributed by the Langage Naturel list <LN at cines.fr>
Information, subscription : http://www.atala.org/article.php3?id_article=48
English version       : 
Archives                 : http://listserv.linguistlist.org/archives/ln.html
                                http://liste.cines.fr/info/ln

The LN list is sponsored by ATALA (Association pour le Traitement
Automatique des Langues)
Information and membership : http://www.atala.org/
-------------------------------------------------------------------------


