26.1380, Calls: Text/Corpus Linguistics/UK
linguist at listserv.linguistlist.org
Thu Mar 12 03:47:17 UTC 2015
LINGUIST List: Vol-26-1380. Wed Mar 11 2015. ISSN: 1069 - 4875.
Subject: 26.1380, Calls: Text/Corpus Linguistics/UK
Moderators: linguist at linguistlist.org (Damir Cavar, Malgorzata E. Cavar)
Reviews: reviews at linguistlist.org (Anthony Aristar, Helen Aristar-Dry, Sara Couture)
Homepage: http://linguistlist.org
Editor for this issue: Erin Arnold <earnold at linguistlist.org>
Date: Wed, 11 Mar 2015 23:45:49
From: Piotr Banski [banski at ids-mannheim.de]
Subject: 3rd Meeting of the Workshop on Challenges in the Management of Large Corpora
Full Title: 3rd Meeting of the Workshop on Challenges in the Management of Large Corpora
Short Title: CMLC-3
Date: 20-Jul-2015 - 20-Jul-2015
Location: Lancaster, United Kingdom
Contact Person: Piotr Banski
Meeting Email: banski at ids-NOSPAMmannheim.de
Web Site: http://corpora.ids-mannheim.de/cmlc.html
Linguistic Field(s): Text/Corpus Linguistics
Call Deadline: 22-Mar-2015
Meeting Description:
This half-day workshop will gather the leading researchers in the field of
Language Resource creation and Corpus Linguistics, in order to provide a
platform for an intensive exchange of expertise, results and ideas concerning
topics revolving around the maintenance, curation, development and efficient
use of large, structured, annotated corpus resources.
2nd Call for Papers
The third edition of CMLC will accompany Corpus Linguistics 2015 in Lancaster and will be held on 20 July 2015. This half-day workshop will gather the leading researchers in the field of Language Resource creation and Corpus Linguistics, in order to provide a platform for an intensive exchange of expertise, results and ideas, in particular concerning the following topics:
- Recent developments in ongoing web-as-corpus initiatives, national corpora, reference corpora, and other very large corpora
- Evaluation and investigation of the properties of large corpora
- Extraction, representation, and management of metadata
- Virtualization / techniques for drawing and accessing stratified virtual corpora
- Increasing the coverage of underrepresented strata
- Legal issues including license models and license management
- Acquisition and curation of large text archives from third parties
- Legal and technological issues of corpora physically distributed over different locations
- System- and database architectures for very large semi-structured data sets
- Heavily annotated corpora
- Use of annotation standards for large data sets
- Issues of interoperability and tool chaining
- Interfaces for user-provided annotations
- Quality control of annotations in large data sets
- Efficient and scalable user interfaces
- Effective querying of large corpora with multiple annotation layers
- Effective techniques for analyzing corpus data
- Strategies and techniques for maximizing recall and coping with large numbers of false positives
- Visualization and other techniques that facilitate the linking between quantitative investigations and qualitative interpretations
- “Put the computation near the data” as a strategy for dealing with IPR restrictions
- Open-source software and open-data corpora strategies
- Other issues that arise in the context of management of large datasets
We invite extended abstracts (up to 4 pages standard size, references excluded, exclusively as PDF) addressing some of the topics listed above.
Submission deadline: 22 March, midnight GMT
Submission address: http://linguistlist.org/easyabs/cmlc-2015
A volume of proceedings is planned.
The home page of CMLC events is located at http://corpora.ids-mannheim.de/cmlc.html
Organizing Committee:
- Piotr Bański, Marc Kupietz, Harald Lüngen, Andreas Witt (Institut für Deutsche Sprache, Mannheim)
- Hanno Biber, Evelyn Breiteneder (Institute for Corpus Linguistics and Text Technology, Vienna)
Programme Committee:
- Damir Ćavar (Indiana University, Bloomington)
- Isabella Chiari (Sapienza University of Rome)
- Dan Cristea (“Alexandru Ioan Cuza” University of Iasi)
- Václav Cvrček (Charles University Prague)
- Mark Davies (Brigham Young University)
- Tomaž Erjavec (Jožef Stefan Institute)
- Alexander Geyken (Berlin-Brandenburgische Akademie der Wissenschaften)
- Andrew Hardie (Lancaster University)
- Serge Heiden (ENS de Lyon)
- Nancy Ide (Vassar College)
- Miloš Jakubíček (Lexical Computing Ltd.)
- Adam Kilgarriff (Lexical Computing Ltd.)
- Krister Lindén (University of Helsinki)
- Martin Mueller (Northwestern University)
- Nelleke Oostdijk (Radboud University Nijmegen)
- Christian-Emil Smith Ore (University of Oslo)
- Piotr Pęzik (University of Łódź)
- Uwe Quasthoff (Leipzig University)
- Paul Rayson (Lancaster University)
- Laurent Romary (INRIA, DARIAH)
- Roland Schäfer (FU Berlin)
- Serge Sharoff (University of Leeds)
- Mária Simková (Slovak Academy of Sciences)
- Jörg Tiedemann (Uppsala University)
- Dan Tufiş (Romanian Academy, Bucharest)
- Tamás Váradi (Research Institute for Linguistics, Hungarian Academy of Sciences)
Please see http://corpora.ids-mannheim.de/cmlc.html for more information and general updates.