<html>
<head>
<meta http-equiv="content-type" content="text/html;
charset=ISO-8859-1">
</head>
<body bgcolor="#FFFFFF" text="#000000">
[Apologies for multiple postings]<br>
<br>
<p style="padding-top: 0pt; " class="paragraph_style_1"><span
class="style"><b>Background and Motivation</b></span><span
class="style_2"> </span><span class="style_3"><br>
</span></p>
<p class="paragraph_style_1"><span class="style_2">Web 2.0 has
shifted the authorship of content from institutions to
individuals; the web has become a channel where users exchange,
explain or write about their lives and interests, give opinions
and rate others’ opinions. The so-called User Generated Content
(UGC) in text form is a valuable resource that can be exploited
for many purposes, such as cross-lingual information retrieval,
opinion mining, enhanced web search, social science analysis,
intelligent advertising, and so on.<br>
</span></p>
<p class="paragraph_style_1"><span class="style_2">To mine data
from Web 2.0, we first need to understand its content.
Analysing UGC is challenging because of its casual language: it
is full of abbreviations, slang and domain-specific terms and,
compared to published, edited text, has a higher rate of
spelling and grammar errors. Standard NLP techniques, which
analyse text and provide formal representations of surface
data, have typically been developed for standard language and
may not yield the expected results on UGC. For example,
shortened or misspelled words, which are very frequent in the
informal style of Web 2.0, increase the number of surface forms
that express a single concept.<br>
</span></p>
<p class="paragraph_style_1"><span class="style_2">This workshop
aims to provide a meeting point for researchers working, in one
way or another, on the processing of textual UGC, as well as
for developers of UGC-based applications and technologies, from
both industry and academia.<br>
</span></p>
<br>
<p class="paragraph_style_1"><span class="style"><b>Topics of
Interest</b><br>
</span></p>
<p class="paragraph_style_1"><span class="style_2">We are mainly
interested in, but not restricted to, the following research
questions:<br>
</span></p>
<p class="paragraph_style_3"><span class="style_2">● What
characterises UGC? Linguistic and textual phenomena that
distinguish UGC from standard written text, and may pose a
challenge for NLP.<br>
</span></p>
<p class="paragraph_style_3"><span class="style_2">● Definition of
norm, concept of error, deviation and variation in UGC.<br>
</span></p>
<p class="paragraph_style_3"><span class="style_2">● Criteria and
standards for the annotation of evaluation corpora in UGC at
various levels of linguistic analysis (form, part of speech,
constituents, dependencies, speech acts, deviation types, etc.).<br>
</span></p>
<p class="paragraph_style_4"><span class="style_2">● How text
quality affects processing tasks (tokenization, POS tagging,
chunking, parsing, named-entity detection, etc.)<br>
</span></p>
<p class="paragraph_style_4"><span class="style_2">● Architecture
and software design for flexible adaptation of NLP processing
pipelines to new domains (topic domains and text-genre domains)<br>
</span></p>
<p class="paragraph_style_3"><span class="style_2">● Text
normalisation vs adaptation of processing tools:<br>
</span></p>
<p class="paragraph_style_5"><span class="style_2">○ Pros and cons<br>
</span></p>
<p class="paragraph_style_5"><span class="style_2">○ Task dependent?<br>
</span></p>
<p class="paragraph_style_5"><span class="style_2">○ Costs and
benefits<br>
</span></p>
<p class="paragraph_style_6"><span class="style_2">○ Hybrid
solutions<br>
</span></p>
<p class="paragraph_style_4"><span class="style_2">● Approaches to
normalisation (text checking, ASR, MT techniques, etc.)<br>
</span></p>
<p class="paragraph_style_4"><span class="style_2">● Evaluation
issues related to processing and normalising UGC</span><span
class="style_5"><br>
</span></p>
<br>
<p class="paragraph_style_1"><span class="style"><b>Intended
Audience</b><br>
</span></p>
<p class="paragraph_style_2"><span class="style_2">The workshop aims
to bring together researchers and developers from academia
and industry. In particular, perspectives from the following
user groups are welcome:<br>
</span></p>
<p class="paragraph_style_1"><span class="style_2">- UGC-based
application developers, from both research and industry<br>
</span></p>
<p class="paragraph_style_1"><span class="style_2">- Researchers
from the NLP, IR and IE communities<br>
</span></p>
<p class="paragraph_style_2"><span class="style_2">- PhD students
interested in or working on the processing of UGC<br>
</span></p>
<p class="paragraph_style_2"><span class="style"><b>Submissions</b><br>
</span></p>
<p class="paragraph_style_8">● Oral papers and posters should follow the main conference
formatting requirements (<a
href="http://www.lrec-conf.org/lrec2012/"
title="http://www.lrec-conf.org/lrec2012/">http://www.lrec-conf.org/lrec2012/</a>).<br>
</p>
<p class="paragraph_style_8">● To submit contributions, please
follow the instructions at <a
href="https://www.softconf.com/lrec2012/UGC2012/"
title="https://www.softconf.com/lrec2012/UGC2012/">https://www.softconf.com/lrec2012/UGC2012/</a><br>
</p>
<p class="paragraph_style_8">● Contributions will undergo a
double review by members of the programme committee. When
submitting a paper from the START page, authors will be asked to
provide essential information about resources (in a broad sense,
i.e. also technologies, standards, evaluation kits, etc.) that
have been used for the work described in the paper or are a new
result of their research. For further information on this new
initiative, please refer to:<br>
</p>
<p class="paragraph_style_8"><a
href="http://www.lrec-conf.org/lrec2012/?LRE-Map-2012"
title="http://www.lrec-conf.org/lrec2012/?LRE-Map-2012">http://www.lrec-conf.org/lrec2012/?LRE-Map-2012</a><br>
<br>
</p>
<p class="paragraph_style_1"><b><span class="style">Important Dates</span><span
class="style_2"> </span></b><span class="style_3"><br>
</span></p>
<p class="paragraph_style_10"><span class="style_2"></span>February
15: Paper submission deadline<br>
</p>
<p class="paragraph_style_10">March 15: Acceptance notifications<br>
</p>
<p class="paragraph_style_10">March 30: Camera-ready papers<br>
</p>
<p class="paragraph_style_10">May 26: Afternoon Workshop at LREC<br>
</p>
<p class="paragraph_style_2"><span class="style_2"><br>
</span></p>
<p class="paragraph_style_1"><span class="style"><b>Organising
Committee</b><br>
</span></p>
<p class="paragraph_style_2"><span class="style_2">Laura Alonso i
Alemany, </span><span class="style_7">Universidad Nacional de
Córdoba (Argentina)</span><span class="style_5"><br>
</span></p>
<p class="paragraph_style_2"><span class="style_2">Jordi Atserias, </span><span
class="style_7">Yahoo! Research (Spain)</span><span
class="style_5"><br>
</span></p>
<p class="paragraph_style_2"><span class="style_2">Toni Badia, </span><span
class="style_7">Universitat Pompeu Fabra (Spain)</span><span
class="style_5"><br>
</span></p>
<p class="paragraph_style_2"><span class="style_2">Maite Melero, </span><span
class="style_7">Barcelona Media Innovation Center (Spain)</span><span
class="style_5"><br>
</span></p>
<p class="paragraph_style_2"><span class="style_2">Martí Quixal, </span><span
class="style_7">Barcelona Media Innovation Center (Spain)<br>
</span></p>
<p class="paragraph_style_2"><span class="style_5"><br>
</span></p>
<p class="paragraph_style_2"><span class="style"><b>Programme
Committee</b><br>
</span></p>
<p class="paragraph_style_8">Rafael Banchs, Institute for Infocomm
Research - A*Star (Singapore)<br>
</p>
<p class="paragraph_style_8">Steven Bedrick, Oregon Health &amp;
Science University<br>
</p>
<p class="paragraph_style_8">Joan Codina, Universitat Pompeu Fabra
(Spain)<br>
</p>
<p class="paragraph_style_8">Louise-Amélie Cougnon, Université
Catholique de Louvain, ILC, Cental (Belgium)<br>
</p>
<p class="paragraph_style_8">Jennifer Foster, Dublin City University
(Ireland)<br>
</p>
<p class="paragraph_style_8">Michael Gamon, Microsoft Research (USA)<br>
</p>
<p class="paragraph_style_8">Fei Liu, Bosch Research (USA)<br>
</p>
<p class="paragraph_style_8">Ulrike Pado, VICO
Research&amp;Consulting GmbH<br>
</p>
<p class="paragraph_style_8">Lluís Padró, Universitat Politècnica de
Catalunya (Spain)<br>
</p>
<p class="paragraph_style_8">Alan Ritter, CSE, University of
Washington (USA)<br>
</p>
<p class="paragraph_style_8">Roser Saurí, Barcelona Media Innovation
Center (Spain)<br>
</p>
<p class="paragraph_style_8">Paul Schmidt, Institut der Gesellschaft
zur Förderung der Angewandten Informationsforschung (Germany)<br>
</p>
<p class="paragraph_style_8">L Venkata Subramaniam, IBM Research
(India)<br>
</p>
<p class="paragraph_style_11"><span class="style_8"><br>
</span></p>
<br>
</body>
</html>