[Corpora-List] CFP: IJCAI 2007 Workshop on Analytics for Noisy Unstructured Text Data

L V Subramaniam lvsubram at in.ibm.com
Tue Jun 20 13:33:29 UTC 2006


CALL FOR PAPERS

                                                AND 2007


                                    IJCAI 2007 Workshop on 
                   Analytics for Noisy Unstructured Text Data
                         8 January, 2007, Hyderabad, India

                         http://research.ihost.com/and2007
 
             held at 20th Int. Jt. Conf. on Artificial Intelligence 
                       (IJCAI 2007) http://www.ijcai-07.org/

              Deadline for Papers is September 25th 2006
 
WORKSHOP DESCRIPTION AND OBJECTIVES
Noisy unstructured text data is found in informal settings such as online
chat, SMS, emails, message boards, newsgroups, blogs, wikis and web
pages. Text produced by processing spontaneous speech, printed text, or
handwritten text also contains processing noise. Text produced under
such circumstances is typically highly noisy, containing spelling errors,
abbreviations, non-standard words, false starts, repetitions, missing
punctuation, missing case information, and pause-filling words such as
"um" and "uh." Such text can be seen in large amounts in contact centers,
on-line chat rooms, OCRed text documents, SMS corpora, etc. The theme of
the IJCAI 2007 Conference is "AI and its benefits to society." In keeping
with this theme, this workshop proposes to look at text analytics of
highly noisy text that is produced in such everyday applications in
society.

The goal of the workshop is to focus on the problems encountered in
analyzing such noisy documents coming from various sources. The nature of
the text warrants moving beyond traditional text analytics techniques. We
hope that the workshop will allow researchers to present current research
and development in addressing this challenge. We also believe that this
workshop will lead to the sharing of real-life noisy data sets and to
their becoming available to a wider research community.
 
TOPICS
We welcome original research papers that identify key problems related to 
noisy text analytics and offer solutions. We particularly encourage 
contributions that look at solving real-life problems in the different
settings where such data is produced in huge amounts. Potential topics
include (but are not limited to):
* NLP techniques for handling noisy unstructured data
* Characterization of the types of noise in documents
* Genre recognition based on the type of noise
* Robust parsing
* Characterizing, modeling and accounting for historical language change
* Methods for detecting and correcting spelling and grammatical errors in 
  noisy text
* Information Extraction and Retrieval from noisy text
* Automatic classification and clustering of imprecise documents 
* Noise-invariant document summarization techniques
* Issues in keyword search in presence of noise in unstructured data
* Machine Translation for noisy text
* Text analysis techniques for analysis and mining of call logs,
  transcribed calls, web logs, chat logs, email exchanges
* Business Intelligence (BI) applications for contact centers that deal
  with noisy data
* Surveys on aspects of text analytics for noisy unstructured data
 
PARTICIPATION
We hope that the workshop will allow researchers working in areas related
to unstructured data analytics, Natural Language Processing, Information
Extraction, Information Retrieval, etc., to focus on the needs of users
extracting useful information from noisy text. The target audience is a
mix of academic and industry researchers working with noisy text. We
believe this work is of direct relevance to domains such as call centers,
the world-wide web, and government organizations that need to analyze huge
amounts of noisy data.

IAPR ENDORSEMENT
This workshop is endorsed by the International Association for Pattern
Recognition (http://www.iapr.org).

IMPORTANT DATES
Paper Submission: September 25th, 2006
Notification of Acceptance: October 23rd, 2006
Camera-Ready papers due: November 8th, 2006
Workshop at IJCAI 2007: January 8th, 2007
 
SUBMISSION REQUIREMENTS
We invite papers of up to 8 pages in length in the style specified by IJCAI
(pdf: http://www.ijcai-07.org/ijcai07_format.pdf, 
word: http://www.ijcai-07.org/ijcai07_format.dot, 
LaTeX: http://www.ijcai-07.org/ijcai07_format_latex.tar). 
 
Submissions should be made electronically to lvsubram at in.ibm.com and
rshourya at in.ibm.com by September 25th, 2006.

PUBLICATION
We are currently in negotiation with a leading publisher for the
proceedings to be available onsite. We are also arranging a journal special
issue for post-workshop publication of selected papers.
 
WORKSHOP CHAIRS
Craig Knoblock
University of Southern California
 
Daniel Lopresti
Lehigh University
 
Shourya Roy 
IBM Research, India Research Lab
 
L. Venkata Subramaniam 
IBM Research, India Research Lab

WORKSHOP CONTACTS
* L. V. Subramaniam lvsubram at in.ibm.com 
* Shourya Roy rshourya at in.ibm.com


Please visit the workshop website
***** http://research.ihost.com/and2007  *****
for information about participation and submitting papers.

For general information, please visit the IJCAI website
***** http://www.ijcai-07.org *****


