<html>

  <head>

    <meta http-equiv="content-type" content="text/html;

      charset=ISO-8859-1">

  </head>

  <body bgcolor="#FFFFFF" text="#000000">


    <p style="padding-top: 0pt; " class="paragraph_style_1"><span
        class="style"><b>Background and Motivation</b></span><span
        class="style_2"> </span><span class="style_3"><br>

      </span></p>

    <p class="paragraph_style_1"><span class="style_2">Web 2.0 has
        shifted the authorship of content from institutions to
        individuals; the web has become a channel where users exchange
        information, write about their lives and interests, give
        opinions and rate others’ opinions. So-called User Generated
        Content (UGC) in text form is a valuable resource that can be
        exploited for many purposes, such as cross-lingual information
        retrieval, opinion mining, enhanced web search, social science
        analysis and intelligent advertising.<br>

      </span></p>

    <p class="paragraph_style_1"><span class="style_2">To mine Web 2.0
        data, we first need to understand its content. Analysing UGC is
        challenging because of its casual language, which is full of
        abbreviations, slang and domain-specific terms and, compared to
        published edited text, has a higher rate of spelling and
        grammar errors. Standard NLP techniques, which are used to
        analyse text and provide formal representations of surface
        data, have typically been developed to deal with standard
        language and may not yield the expected results on UGC. For
        example, shortened or misspelled words, which are very frequent
        in the informal style of Web 2.0, increase the variability in
        the forms used to express a single concept.<br>

      </span></p>

    <p class="paragraph_style_1"><span class="style_2">This workshop
        aims to provide a meeting point for researchers working on the
        processing of textual UGC, as well as developers of UGC-based
        applications and technologies, from both industry and
        academia.<br>

      </span></p>

    <br>

    <p class="paragraph_style_1"><span class="style"><b>Topics of

          Interest</b><br>

      </span></p>

    <p class="paragraph_style_1"><span class="style_2">We are mainly

        interested in, but not restricted to, the following research

        questions:<br>

      </span></p>

    <p class="paragraph_style_3"><span class="style_2">● What

        characterises UGC? Linguistic and textual phenomena that

        distinguish UGC from standard written text, and may pose a

        challenge for NLP.<br>

      </span></p>

    <p class="paragraph_style_3"><span class="style_2">● Definition of

        norm, concept of error, deviation and variation in UGC.<br>

      </span></p>

    <p class="paragraph_style_3"><span class="style_2">● Criteria and

        standards for the annotation of evaluation corpora in UGC at

        various levels of linguistic analysis (form, part of speech,

        constituents, dependencies, speech acts, deviation types, etc.).<br>

      </span></p>

    <p class="paragraph_style_4"><span class="style_2">● How the quality
        of text affects processing tasks (tokenization, POS tagging,
        chunking, parsing, named-entity detection, etc.)<br>

      </span></p>

    <p class="paragraph_style_4"><span class="style_2">● Architecture

        and software design for flexible adaptation of NLP processing

        pipelines to new domains (topic domains and text-genre domains)<br>

      </span></p>

    <p class="paragraph_style_3"><span class="style_2">● Text

        normalisation vs adaptation of processing tools:<br>

      </span></p>

    <p class="paragraph_style_5"><span class="style_2">○ Pros and cons<br>

      </span></p>

    <p class="paragraph_style_5"><span class="style_2">○ Task dependent?<br>

      </span></p>

    <p class="paragraph_style_5"><span class="style_2">○ Costs and

        benefits<br>

      </span></p>

    <p class="paragraph_style_6"><span class="style_2">○ Hybrid

        solutions<br>

      </span></p>

    <p class="paragraph_style_4"><span class="style_2">● Approaches to

        normalisation (text checking, ASR, MT techniques, etc.)<br>

      </span></p>

    <p class="paragraph_style_4"><span class="style_2">● Evaluation

        issues related to processing and normalising UGC</span><span

        class="style_5"><br>

      </span></p>

    <br>

    <p class="paragraph_style_1"><span class="style"><b>Intended

          Audience</b><br>

      </span></p>

    <p class="paragraph_style_2"><span class="style_2">The workshop aims
        to bring together researchers and developers from academia and
        industry. In particular, perspectives from the following user
        groups are welcome:<br>

      </span></p>

    <p class="paragraph_style_1"><span class="style_2">- UGC-based

        application developers, from both research and industry<br>

      </span></p>

    <p class="paragraph_style_1"><span class="style_2">- Researchers

        from the NLP, IR and IE communities<br>

      </span></p>

    <p class="paragraph_style_2"><span class="style_2">- PhD students
        interested in or working on the processing of UGC<br>

      </span></p>

    <p class="paragraph_style_2"><span class="style"><b>Submissions</b><br>

      </span></p>

    <p class="paragraph_style_8">● Oral papers and posters should follow
      the main conference formatting requirements (<a
        href="http://www.lrec-conf.org/lrec2012/"
        title="http://www.lrec-conf.org/lrec2012/">http://www.lrec-conf.org/lrec2012/</a>).<br>
    </p>

    <p class="paragraph_style_8">● To submit contributions, please

      follow the instructions at <a

        href="https://www.softconf.com/lrec2012/UGC2012/"

        title="https://www.softconf.com/lrec2012/UGC2012/">https://www.softconf.com/lrec2012/UGC2012/</a><br>

    </p>

    <p class="paragraph_style_8">● Contributions will undergo a double
      review by members of the programme committee. When submitting a
      paper from the START page, authors will be asked to provide
      essential information about resources (in a broad sense, i.e.
      also technologies, standards, evaluation kits, etc.) that have
      been used for the work described in the paper or are a new
      result of their research. For further information on this new
      initiative, please refer to:<br>

    </p>

    <p class="paragraph_style_8"><a

        href="http://www.lrec-conf.org/lrec2012/?LRE-Map-2012"

        title="http://www.lrec-conf.org/lrec2012/?LRE-Map-2012">http://www.lrec-conf.org/lrec2012/?LRE-Map-2012</a><br>

      <br>

    </p>

    <p class="paragraph_style_1"><b><span class="style">Important Dates</span><span

          class="style_2"> </span></b><span class="style_3"><br>

      </span></p>

    <p class="paragraph_style_10"><span class="style_2"></span>February

      15: Paper submission deadline<br>

    </p>

    <p class="paragraph_style_10">March 15: Acceptance notifications<br>

    </p>

    <p class="paragraph_style_10">March 30: Camera-ready papers<br>

    </p>

    <p class="paragraph_style_10">May 26: Afternoon Workshop at LREC<br>

    </p>

    <p class="paragraph_style_2"><span class="style_2"><br>

      </span></p>

    <p class="paragraph_style_1"><span class="style"><b>Organising

          Committee</b><br>

      </span></p>

    <p class="paragraph_style_2"><span class="style_2">Laura Alonso i

        Alemany, </span><span class="style_7">Universidad Nacional de

        Córdoba (Argentina)</span><span class="style_5"><br>

      </span></p>

    <p class="paragraph_style_2"><span class="style_2">Jordi Atserias, </span><span

        class="style_7">Yahoo! Research (Spain)</span><span

        class="style_5"><br>

      </span></p>

    <p class="paragraph_style_2"><span class="style_2">Toni Badia, </span><span

        class="style_7">Universitat Pompeu Fabra (Spain)</span><span

        class="style_5"><br>

      </span></p>

    <p class="paragraph_style_2"><span class="style_2">Maite Melero, </span><span

        class="style_7">Barcelona Media Innovation Center (Spain)</span><span

        class="style_5"><br>

      </span></p>

    <p class="paragraph_style_2"><span class="style_2">Martí Quixal, </span><span

        class="style_7">Barcelona Media Innovation Center (Spain)<br>

      </span></p>

    <p class="paragraph_style_2"><span class="style_5"><br>

      </span></p>

    <p class="paragraph_style_2"><span class="style"><b>Programme

          Committee</b><br>

      </span></p>

    <p class="paragraph_style_8">Rafael Banchs, Institute for Infocomm

      Research - A*Star (Singapore)<br>

    </p>

    <p class="paragraph_style_8">Steven Bedrick, Oregon Health &amp;
      Science University<br>

    </p>

    <p class="paragraph_style_8">Joan Codina, Universitat Pompeu Fabra

      (Spain)<br>

    </p>

    <p class="paragraph_style_8">Louise-Amélie Cougnon, Université
      Catholique de Louvain, ILC, Cental (Belgium)<br>

    </p>

    <p class="paragraph_style_8">Jennifer Foster, Dublin City University

      (Ireland)<br>

    </p>

    <p class="paragraph_style_8">Michael Gamon, Microsoft Research (USA)<br>

    </p>

    <p class="paragraph_style_8">Fei Liu, Bosch Research (USA)<br>

    </p>

    <p class="paragraph_style_8">Ulrike Pado, VICO
      Research&amp;Consulting GmbH<br>

    </p>

    <p class="paragraph_style_8">Lluís Padró, Universitat Politècnica de

      Catalunya (Spain)<br>

    </p>

    <p class="paragraph_style_8">Alan Ritter, CSE, University of

      Washington  (USA)<br>

    </p>

    <p class="paragraph_style_8">Roser Saurí, Barcelona Media Innovation

      Center (Spain)<br>

    </p>

    <p class="paragraph_style_8">Paul Schmidt, Institut der Gesellschaft
      zur Förderung der Angewandten Informationsforschung (Germany)<br>
    </p>

    <p class="paragraph_style_8">L Venkata Subramaniam, IBM Research

      (India)<br>

    </p>

    <p class="paragraph_style_11"><span class="style_8"><br>

      </span></p>

    <br>

  </body>

</html>