[Corpora-List] Call for Papers: Web as Corpus at EACL 2006

Marco Baroni baroni at sslmit.unibo.it
Wed Nov 2 12:54:02 UTC 2005


Apologies for cross-posting...


********************************************************

                                       Call for Papers:
                         2nd WEB AS CORPUS WORKSHOP

In conjunction with the 11th Conference of the European Chapter of the
Association for Computational Linguistics (EACL)

Trento, Italy
April 4, 2006


Workshop site:

http://sslmit.unibo.it/~baroni/web_as_corpus_eacl06.html


Previous WaC Workshop:

http://sslmit.unibo.it/~baroni/web_as_corpus_cl05.html


Co-chairs: Adam Kilgarriff and Marco Baroni



Topics
------

Although a growing body of work has shown that the World Wide Web is a
mine of language data of unprecedented richness and ease of access
(see, e.g., the papers collected in Kilgarriff and Grefenstette,
2003), many fundamental issues about the viability and exploitation of
the Web as a linguistic corpus are only starting to be tackled. These
range from Web frequency distributions and registers, to efficient
handling of massive data sets, to copyright. Research on the Web as
corpus is currently at a very exciting stage: increasing evidence
points to the enormous potential of the Internet as a source of
linguistic data, but we are still far from a working, fully-fledged
linguists' search engine.

We invite submissions which:

- describe Web corpus collection projects, or modules for one part of
   the process (crawling, filtering, language-id, tokenizing,
   lemmatizing, POS-tagging, indexing, ...)

- explore characteristics of Web data, from a linguistics/NLP
   perspective

- use crawled Web data for NLP purposes.

Preference will be given to projects where Web data are downloaded and
processed directly, rather than via search engine interfaces.


Submission Information
----------------------

Authors are invited to submit full papers on original, unpublished
work in the topic area of this workshop. Submissions should follow the
two-column format of ACL proceedings and should not exceed eight (8)
pages, including references. We strongly recommend the use of ACL
LaTeX or Microsoft Word style files tailored for this year's
conference available at

http://eacl06.itc.it/submission/submission.htm

Papers must conform to the official EACL-06 style guidelines, and we
reserve the right to reject submissions that do not conform to these
styles, including font size restrictions. Submissions should be in PDF
format and must include all fonts, so that the paper will print (not
just view) anywhere.

Please submit your paper no later than January 6, 2006. Information on
the submission procedure will be posted on the workshop site as soon
as possible, and in any case well in advance of the submission
deadline.

Each submission will be reviewed by at least two members of the
program committee. Accepted papers will be published in the workshop
proceedings.

Dual submissions to the main EACL 2006 conference and this workshop
are allowed; if you submit to the main session, do indicate this when
you submit to the workshop, and specify your EACL submission reference
number, for administrative ease. If your paper is accepted for the
main session, you should withdraw your paper from the workshop upon
notification by the main session.


Important Dates
---------------

January 6, 2006 - Deadline for workshop papers

January 27, 2006 - Notification of acceptance

February 10, 2006 - Camera-ready papers due

April 4, 2006 - Workshop

As the schedule is extremely tight, deadline extensions are NOT possible.


Program Committee
-----------------

Marco Baroni (co-chair)
Silvia Bernardini
Massimiliano Ciaramita
Stefan Evert
William H. Fletcher
Gregory Grefenstette
Frank Keller
Adam Kilgarriff (co-chair)
Mirella Lapata
Anke Lüdeling
Philip Resnik
Serge Sharoff


Contacts
--------

Adam Kilgarriff: adam at lexmastersclass.com

Marco Baroni: baroni at sslmit.unibo.it


Further Information
-------------------

Information on registration and registration fees will be provided at
the main conference site:

http://eacl06.itc.it/

The EACL 2006 Workshops site:

http://www.science.uva.nl/~mdr/EACL2006Workshops/

Note in particular the related workshop on New Text: Wikis and blogs
and other dynamic text sources:

http://www.sics.se/jussi/newtext/


