Call: 3rd Web as Corpus Workshop (WAC3)

Thierry Hamon thierry.hamon at LIPN.UNIV-PARIS13.FR
Tue Jan 23 18:30:13 UTC 2007


Date: Tue, 23 Jan 2007 17:24:32 +0100
From: Isabelle Lecroart <lecroart at tedm.ucl.ac.be>
Message-ID: <45B636C0.8030001 at tedm.ucl.ac.be>



      Call for papers

3rd Web as Corpus Workshop (WAC3)
incorporating Cleaneval
An ACL-SIGWAC event

Sept. 15-16, 2007
University of Louvain, Louvain-la-Neuve, Belgium

More and more people are using Web data for linguistic and NLP
research.  The workshop provides a venue for exploring how we can use
it effectively and what we will find if we do.

We invite submissions that:

    * describe Web corpus collection projects, or modules for one part
      of the process (crawling, filtering, language-id, tokenising,
      lemmatising, POS-tagging, indexing, ...)

    * explore characteristics of Web data from a linguistics/NLP
      perspective, including registers, domains and frequency
      distributions

    * use crawled Web data for NLP purposes (with emphasis on the data
      rather than the use)


    Cleaneval

Anyone using web data needs to clean it, to get rid of unwanted
material such as HTML markup, navigation bars and
advertisements. To date there has been no sharing of resources or
expertise and the cleaning has often been done minimally. Cleaneval is
an exercise to promote sharing and to improve our understanding of the
issues. It will take the now-familiar form of an open competition and
shared task. More info at Cleaneval <http://cleaneval.sigwac.org.uk>.


    Previous WAC workshops

WAC1 took place at the Corpus Linguistics conference, Birmingham, UK,
July 2005. More info:
http://sslmit.unibo.it/%7Ebaroni/web_as_corpus_cl05.html

WAC2 took place at EACL, Trento, Italy, April 2006. More info:
http://sslmit.unibo.it/%7Ebaroni/web_as_corpus_eacl06.html


    Invited speaker: Kevin Scannell

Kevin Scannell, of Saint Louis University, Missouri, USA, has been working
with scholars of a range of smaller languages to develop web corpora
for those languages. The website
http://borel.slu.edu/crubadan/stadas.html currently lists 135
corpora/languages.


    Submission

For regular papers:
Papers (6-10 pages), demos (max. 2 pages) and posters (max. 2 pages)
must be written in English and follow ACL formatting. Template files
(.doc and LaTeX) are available on the website.
For Cleaneval submissions, see the Cleaneval website:
http://cleaneval.sigwac.org.uk


            Deadline: 1 May 2007


    Venue

Université catholique de Louvain
http://www.uclouvain.be/en-index.html, in the elegant new city of
Louvain-la-Neuve http://www.eupedia.com/belgium/louvain-la-neuve.shtml
(Belgium). Large computer rooms will be available for demo sessions.


    Points of contact


        Workshop Co-chairs

Cédrick Fairon, UCLouvain, CENTAL, fairon at tedm.ucl.ac.be
Prof. Gilles-Maurice de Schryver, Universiteit Gent


        Cleaneval committee

Marco Baroni, U Trento; Secretary, SIGWAC
Tony Hartley, U Leeds
Adam Kilgarriff, Lexical Computing Ltd; Chair, SIGWAC
Serge Sharoff, U Leeds


        Local organisation team

Bernadette Dehottay, UCLouvain, CENTAL, dehottay at tedm.ucl.ac.be
Julia Medori, CENTAL, UCLouvain
Laurent Kevers, CENTAL, UCLouvain
Hubert Naets, CENTAL, UCLouvain
Isabelle Lecroart, CENTAL, UCLouvain
Claude Devis, CENTAL, UCLouvain

Contact us:
Bernadette Dehottay
Université catholique de Louvain
Centre for Natural Language Processing (CENTAL)
Place Blaise Pascal, 1
1348 Louvain-la-Neuve
Tel. +32 10 47 37 88
Fax. +32 10 47 26 06
dehottay at tedm.ucl.ac.be


-------------------------------------------------------------------------
Message distributed by the Langage Naturel list <LN at cines.fr>
Information, subscription: http://www.atala.org/article.php3?id_article=48
English version       : 
Archives                 : http://listserv.linguistlist.org/archives/ln.html
                                http://liste.cines.fr/info/ln

The LN list is sponsored by ATALA (Association pour le Traitement
Automatique des Langues)
Information and membership: http://www.atala.org/
-------------------------------------------------------------------------


