[Corpora-List] Corpus Analysis with Noise in the Signal workshop (CANS 2013)

Alistair Baron a.baron at comp.lancs.ac.uk
Fri May 10 19:20:42 UTC 2013


Call for participation.


Dear list members,

We invite any of you interested in the application of corpus linguistics
methods to datasets which contain substantial "noise" (e.g. spelling
variation in historical, learner and web corpora) to join us in *Lancaster*
on *July 22nd* for the *Corpus Analysis with Noise in the Signal workshop
(CANS 2013)*.

The workshop will offer participants the chance to gain insights into the
characteristics of the noise in different language varieties, the effect of
the noise on different corpus linguistic techniques and different methods
to either negate the noise or to produce more robust tools that can
accurately process noisy textual data. A wide range of presentations will
be given, as the programme below shows.


+++++++++++++++++++++++++++++++++++++++

Turo Hiltunen & Jukka Tyrkkö
Tagging Early Modern English Medical Texts (1500-1700)

Marcel Bollmann
Spelling normalization of historical German with sparse training data

Felix Bildhauer & Roland Schäfer
Token-level noise in large Web corpora and nondestructive normalization for
linguistic applications

Elena Klyachko, Timofey Arkhangelskiy, Olesya Kisselev & Ekaterina
Rakhilina
Automatic error detection in Russian learner language

Maura Ratia
Performing keyword analysis on automatically vs. manually VARDed corpora:
The case of Stuart plague treatises

Verena Möller
Retrieving passive structures from the Secondary-Level Corpus of Learner
English (SCooLE) - How can we make part-of-speech tagging more successful?

+++++++++++++++++++++++++++++++++++++++

There will also be software demonstrations and a round-table discussion of
the challenges associated with corpus analysis with "noisy" data, including
questions such as:

   - What are the key challenges of dealing with noisy textual data going
   forward?
   - When should we leave "noise" where it is? And for what reason(s)?
   - What are the dangers of ignoring the noise?


Further details of the workshop are available here:
http://ucrel.lancs.ac.uk/cans2013/

The workshop will be held before the Corpus Linguistics 2013 conference at
Lancaster University. Registration details for both the workshop and the
conference are here: http://ucrel.lancs.ac.uk/cl2013/register.php

We look forward to seeing you in Lancaster.

Thanks and best wishes,

Alistair Baron,
Paul Rayson,
Dawn Archer
(CANS 2013 workshop organising committee)
cans2013 at comp.lancs.ac.uk