[Corpora-List] Wacky! Working Papers on the Web as Corpus

Marco Baroni baroni at sslmit.unibo.it
Tue Sep 26 12:42:45 UTC 2006


Dear All,

We are glad to announce that the book:

Wacky! Working Papers on the Web as Corpus

is freely available online from the address:

http://wackybook.sslmit.unibo.it/

Alternatively, a hard copy of the book can be purchased from the publisher
(http://www.gedit.it/).

Details follow.

Best regards,

Marco Baroni and Silvia Bernardini


****************************


Baroni, Marco and Bernardini, Silvia (eds). Wacky! Working Papers
on the Web as Corpus. Bologna: GEDIT. 2006. [ISBN: 88-6027-004-9]

The book collects articles deriving from presentations at two Web as Corpus
workshops (held in Forlì and Birmingham in 2005) and articles that were
born out of discussions and collaborative experimentation among the WaCky
community members. WaCky (for "Web as Corpus kool ynitiative") brings
together linguists who think the World Wide Web is a great resource for
their research, and that it would be even greater if it could be annotated
and interrogated in a more linguist-friendly way.

Topics covered in the book include practical experiences with the
construction and evaluation of Web corpora, methods to classify and
represent Web corpora, and applications to terminology. The introduction
provides an accessible account of the various steps and issues involved in
building very large Web corpora and making them available to the linguistic
community. The languages studied include English, Chinese and Japanese.

Web corpora are undoubtedly a timely and important topic for the
corpus/computational linguistics community. This book is unique in that it
provides detailed technical discussion of the issues related to
constructing Web corpora, as well as examples of concrete applications to
terminology practice and teaching. As such, it should be of interest to a
wide audience of linguists, language technologists, language/translation
teachers and language professionals.


Table of Contents:

A WaCky Introduction
Silvia Bernardini, Marco Baroni and Stefan Evert

Experience Building a Large Corpus for Chinese Lexicon Construction
Thomas Emerson and John O'Neil

Creating General-Purpose Corpora Using Automated Search Engine Queries
Serge Sharoff

Evaluation of Japanese Web-Based Reference Corpora: Effects of Seed
Selection and Time Interval
Motoko Ueyama

Measuring Web Corpus Randomness: A Progress Report
Massimiliano Ciaramita and Marco Baroni

Using the Web as a Source of LSP Corpora in the Terminology Classroom
Sara Castagnoli

Specialized Corpora from the Web and Term Extraction for Simultaneous
Interpreters
Claudio Fantinuoli

The Net for the Graphs: Towards Webgenre Representation for Corpus
Linguistic Studies
Alexander Mehler and Rüdiger Gleim

****************************


