<html>
  <head>
    <meta http-equiv="content-type" content="text/html;
      charset=ISO-8859-1">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    [Apologies for multiple postings]<br>
    <br>
    <p style="padding-top: 0pt; " class="paragraph_style_1"><span
        class="style">Background and Motivation</span><span
        class="style_2"> </span><span class="style_3"><br>
      </span></p>
    <p class="paragraph_style_1"><span class="style_2">Web 2.0 has
        shifted the authorship of content from institutions to
        individuals; the web has become a channel where users share and
        write about their lives and interests, give opinions, and rate
        others’ opinions. This so-called User Generated Content (UGC) in
        text form is a valuable resource that can be exploited for many
        purposes, such as cross-lingual information retrieval, opinion
        mining, enhanced web search, social science analysis, and
        intelligent advertising.<br>
      </span></p>
    <p class="paragraph_style_1"><span class="style_2">To mine data from
        Web 2.0, we first need to understand its content. Analysing UGC
        is challenging because of its casual language, which is full of
        abbreviations, slang and domain-specific terms and, compared to
        published edited text, has a higher rate of spelling and grammar
        errors. Standard NLP techniques, which are used to analyse text
        and provide formal representations of surface data, have
        typically been developed for standard language and may not yield
        the expected results on UGC. For example, shortened or
        misspelled words, which are very frequent in the informal style
        of Web 2.0, increase the variability in the forms used to
        express a single concept. <br>
      </span></p>
    <p class="paragraph_style_1"><span class="style_2">This workshop
        aims to provide a meeting point for researchers working on the
        processing of textual UGC, as well as developers of UGC-based
        applications and technologies, from both industry and academia.<br>
      </span></p>
    <br>
    <p class="paragraph_style_1"><span class="style"><b>Topics of
          Interest</b><br>
      </span></p>
    <p class="paragraph_style_1"><span class="style_2">We are mainly
        interested in, but not restricted to, the following research
        questions:<br>
      </span></p>
    <p class="paragraph_style_3"><span class="style_2">● What
        characterises UGC? Linguistic and textual phenomena that
        distinguish UGC from standard written text, and may pose a
        challenge for NLP.<br>
      </span></p>
    <p class="paragraph_style_3"><span class="style_2">● Definition of
        norm, concept of error, deviation and variation in UGC.<br>
      </span></p>
    <p class="paragraph_style_3"><span class="style_2">● Criteria and
        standards for the annotation of evaluation corpora in UGC at
        various levels of linguistic analysis (form, part of speech,
        constituents, dependencies, speech acts, deviation types, etc.).<br>
      </span></p>
    <p class="paragraph_style_4"><span class="style_2">● How text
        quality affects processing tasks (tokenization, POS tagging,
        chunking, parsing, named-entity detection, etc.).<br>
      </span></p>
    <p class="paragraph_style_4"><span class="style_2">● Architecture
        and software design for flexible adaptation of NLP processing
        pipelines to new domains (topic domains and text-genre domains)<br>
      </span></p>
    <p class="paragraph_style_3"><span class="style_2">● Text
        normalisation vs adaptation of processing tools:<br>
      </span></p>
    <p class="paragraph_style_5"><span class="style_2">○ Pros and cons<br>
      </span></p>
    <p class="paragraph_style_5"><span class="style_2">○ Task dependent?<br>
      </span></p>
    <p class="paragraph_style_5"><span class="style_2">○ Costs and
        benefits<br>
      </span></p>
    <p class="paragraph_style_6"><span class="style_2">○ Hybrid
        solutions<br>
      </span></p>
    <p class="paragraph_style_4"><span class="style_2">● Approaches to
        normalisation (text checking, ASR, MT techniques, etc.)<br>
      </span></p>
    <p class="paragraph_style_4"><span class="style_2">● Evaluation
        issues related to processing and normalising UGC</span><span
        class="style_5"><br>
      </span></p>
    <br>
    <p class="paragraph_style_1"><span class="style"><b>Intended
          Audience</b><br>
      </span></p>
    <p class="paragraph_style_2"><span class="style_2">The workshop aims
        to bring together researchers and developers from academia and
        industry. In particular, perspectives from the following groups
        are welcome:<br>
      </span></p>
    <p class="paragraph_style_1"><span class="style_2">- UGC-based
        application developers, from both research and industry<br>
      </span></p>
    <p class="paragraph_style_1"><span class="style_2">- Researchers
        from the NLP, IR and IE communities<br>
      </span></p>
    <p class="paragraph_style_2"><span class="style_2">- Ph.D. students
        interested in or working on the processing of UGC<br>
      </span></p>
    <p class="paragraph_style_2"><span class="style"><b>Submissions</b><br>
      </span></p>
    ● Oral papers and posters should follow the main conference
    formatting requirements (<a
      href="http://www.lrec-conf.org/lrec2012/"
      title="http://www.lrec-conf.org/lrec2012/">http://www.lrec-conf.org/lrec2012/</a>).<br>
    <p class="paragraph_style_8">● To submit contributions, please
      follow the instructions at <a
        href="https://www.softconf.com/lrec2012/UGC2012/"
        title="https://www.softconf.com/lrec2012/UGC2012/">https://www.softconf.com/lrec2012/UGC2012/</a><br>
    </p>
    <p class="paragraph_style_8">● Contributions will undergo a double
      review by members of the programme committee. When submitting a
      paper from the START page, authors will be asked to provide
      essential information about resources (in a broad sense, i.e.
      also technologies, standards, evaluation kits, etc.) that have
      been used for the work described in the paper or are a new result
      of their research. For further information on this new
      initiative, please refer to:<br>
    </p>
    <p class="paragraph_style_8"><a
        href="http://www.lrec-conf.org/lrec2012/?LRE-Map-2012"
        title="http://www.lrec-conf.org/lrec2012/?LRE-Map-2012">http://www.lrec-conf.org/lrec2012/?LRE-Map-2012</a><br>
      <br>
    </p>
    <p class="paragraph_style_1"><b><span class="style">Important Dates</span><span
          class="style_2"> </span></b><span class="style_3"><br>
      </span></p>
    <p class="paragraph_style_10">February 15: Paper submission deadline<br>
    </p>
    <p class="paragraph_style_10">March 15: Acceptance notifications<br>
    </p>
    <p class="paragraph_style_10">March 30: Camera-ready papers<br>
    </p>
    <p class="paragraph_style_10">May 26: Afternoon Workshop at LREC<br>
    </p>
    <p class="paragraph_style_2"><span class="style_2"><br>
      </span></p>
    <p class="paragraph_style_1"><span class="style"><b>Organising
          Committee</b><br>
      </span></p>
    <p class="paragraph_style_2"><span class="style_2">Laura Alonso i
        Alemany, </span><span class="style_7">Universidad Nacional de
        Córdoba (Argentina)</span><span class="style_5"><br>
      </span></p>
    <p class="paragraph_style_2"><span class="style_2">Jordi Atserias, </span><span
        class="style_7">Yahoo! Research (Spain)</span><span
        class="style_5"><br>
      </span></p>
    <p class="paragraph_style_2"><span class="style_2">Toni Badia, </span><span
        class="style_7">Universitat Pompeu Fabra (Spain)</span><span
        class="style_5"><br>
      </span></p>
    <p class="paragraph_style_2"><span class="style_2">Maite Melero, </span><span
        class="style_7">Barcelona Media Innovation Center (Spain)</span><span
        class="style_5"><br>
      </span></p>
    <p class="paragraph_style_2"><span class="style_2">Martí Quixal, </span><span
        class="style_7">Barcelona Media Innovation Center (Spain)<br>
      </span></p>
    <p class="paragraph_style_2"><span class="style_5"><br>
      </span></p>
    <p class="paragraph_style_2"><span class="style"><b>Programme
          Committee</b><br>
      </span></p>
    <p class="paragraph_style_8">Rafael Banchs, Institute for Infocomm
      Research - A*Star (Singapore)<br>
    </p>
    <p class="paragraph_style_8">Steven Bedrick, Oregon Health &amp;
      Science University<br>
    </p>
    <p class="paragraph_style_8">Joan Codina, Universitat Pompeu Fabra
      (Spain)<br>
    </p>
    <p class="paragraph_style_8">Louise-Amélie Cougnon, Université
      Catholique de Louvain, ILC, Cental (Belgium)<br>
    </p>
    <p class="paragraph_style_8">Jennifer Foster, Dublin City University
      (Ireland)<br>
    </p>
    <p class="paragraph_style_8">Michael Gamon, Microsoft Research (USA)<br>
    </p>
    <p class="paragraph_style_8">Fei Liu, Bosch Research (USA)<br>
    </p>
    <p class="paragraph_style_8">Ulrike Pado, VICO
      Research &amp; Consulting GmbH<br>
    </p>
    <p class="paragraph_style_8">Lluís Padró, Universitat Politècnica de
      Catalunya (Spain)<br>
    </p>
    <p class="paragraph_style_8">Alan Ritter, CSE, University of
      Washington  (USA)<br>
    </p>
    <p class="paragraph_style_8">Roser Saurí, Barcelona Media Innovation
      Center (Spain)<br>
    </p>
    <p class="paragraph_style_8">Paul Schmidt, Institut der Gesellschaft
      zur Förderung der Angewandten Informationsforschung (Germany)<br>
    </p>
    <p class="paragraph_style_8">L Venkata Subramaniam, IBM Research
      (India)<br>
    </p>
    <p class="paragraph_style_11"><span class="style_8"><br>
      </span></p>
    <br>
  </body>
</html>