[Corpora-List] Second Call for Papers: Web as Corpus at EACL 2006
Marco Baroni
baroni at sslmit.unibo.it
Thu Dec 8 20:36:58 UTC 2005
Apologies for cross- and re-posting...
********************************************************
Call for Papers:
2nd WEB AS CORPUS WORKSHOP
In conjunction with the 11th Conference of the European Chapter of the
Association for Computational Linguistics (EACL)
Trento, Italy
April 4, 2006
Workshop site:
http://sslmit.unibo.it/~baroni/web_as_corpus_eacl06.html
Submission form:
http://www.softconf.com/start/EACL06_WS01
Previous WaC Workshop:
http://sslmit.unibo.it/~baroni/web_as_corpus_cl05.html
Co-chairs: Adam Kilgarriff and Marco Baroni
Topics
------
Despite the fact that a growing body of work has shown that the World
Wide Web is a mine of language data of unprecedented richness and ease
of access (see, e.g., the papers collected in Kilgarriff and
Grefenstette, 2003), many fundamental issues about the viability and
exploitation of the Web as a linguistic corpus are just starting to be
tackled, ranging from Web frequency distributions and registers, to
efficient handling of massive data sets, to copyright. Research on the
Web as corpus is currently at a very exciting stage: increasing
evidence points to the enormous potential of the Internet as a source
of linguistic data, but we are still far from a working, fully-fledged
linguists' search engine.
We invite submissions which:
- describe Web corpus collection projects, or modules for one part of
the process (crawling, filtering, language-id, tokenizing,
lemmatizing, POS-tagging, indexing, ...)
- explore characteristics of Web data, from a linguistics/NLP
perspective
- use crawled Web data for NLP purposes.
Preference will be given to projects where Web data are downloaded and
processed directly, rather than being accessed via search engine counts.
Submission Information
----------------------
Authors are invited to submit full papers on original, unpublished
work in the topic area of this workshop. Submissions should follow the
two-column format of ACL proceedings and should not exceed eight (8)
pages, including references. We strongly recommend the use of the ACL
LaTeX or Microsoft Word style files tailored for this year's
conference, available at
http://eacl06.itc.it/submission/submission.htm
Papers must conform to the official EACL-06 style guidelines, and we
reserve the right to reject submissions that do not conform to these
styles, including font size restrictions. Submissions should be in PDF
format and must include all fonts, so that the paper will print (not
just view) anywhere.
Please submit your paper no later than January 6, 2006, using the online
submission form available at
http://www.softconf.com/start/EACL06_WS01
Each submission will be reviewed by at least two members of the
program committee. Accepted papers will be published in the workshop
proceedings.
Dual submissions to the main EACL 2006 conference and this workshop
are allowed; if you submit to the main session, do indicate this when
you submit to the workshop, and specify your EACL submission reference
number, for administrative ease. If your paper is accepted for the
main session, you should withdraw your paper from the workshop upon
notification by the main session.
Important Dates
---------------
January 6, 2006 - Deadline for workshop papers
January 27, 2006 - Notification of acceptance
February 10, 2006 - Camera-ready papers due
April 4, 2006 - Workshop
As the schedule is extremely tight, deadline extensions are NOT possible.
Program Committee
-----------------
Marco Baroni (co-chair)
Silvia Bernardini
Massimiliano Ciaramita
Stefan Evert
William H. Fletcher
Gregory Grefenstette
Frank Keller
Adam Kilgarriff (co-chair)
Mirella Lapata
Anke Lüdeling
Philip Resnik
Serge Sharoff
Contacts
--------
Adam Kilgarriff: adam at lexmastersclass.com
Marco Baroni: baroni at sslmit.unibo.it
Further Information
-------------------
Information on registration and registration fees will be provided at
the main conference site:
http://eacl06.itc.it/
The EACL 2006 Workshops site:
http://www.science.uva.nl/~mdr/EACL2006Workshops/
Note in particular the related workshop on New Text: Wikis and blogs
and other dynamic text sources:
http://www.sics.se/jussi/newtext/