24.2946, Calls: Computational Ling, Text/Corpus Ling, General Ling/Germany

linguist at linguistlist.org
Fri Jul 19 16:45:20 UTC 2013


LINGUIST List: Vol-24-2946. Fri Jul 19 2013. ISSN: 1069 - 4875.

Moderator: Damir Cavar, Eastern Michigan U <damir at linguistlist.org>

Reviews: Veronika Drake, U of Wisconsin Madison
Monica Macaulay, U of Wisconsin Madison
Rajiv Rao, U of Wisconsin Madison
Joseph Salmons, U of Wisconsin Madison
Mateja Schuck, U of Wisconsin Madison
Anja Wanner, U of Wisconsin Madison
       <reviews at linguistlist.org>

Homepage: http://linguistlist.org

Editor for this issue: Bryn Hauk <bryn at linguistlist.org>
================================================================  

Visit LL's Multitree project for over 1000 trees dynamically generated
from scholarly hypotheses about language relationships:
          http://multitree.linguistlist.org/

Date: Fri, 19 Jul 2013 12:44:58
From: Felix Bildhauer [felix.bildhauer at fu-berlin.de]
Subject: DGfS 2014 Workshop: Web Data as a Challenge for Theoretical Linguistics and Corpus Design

Full Title: DGfS 2014 Workshop: Web Data as a Challenge for Theoretical Linguistics and Corpus Design 
Short Title: WEBTL-2014 

Date: 05-Mar-2014 - 07-Mar-2014
Location: Marburg, Germany 
Contact Person: Felix Bildhauer
Meeting Email: webtl2014 at easychair.org
Web Site: http://hpsg.fu-berlin.de/cow/dgfs2014/ 

Linguistic Field(s): Computational Linguistics; General Linguistics; Text/Corpus Linguistics 

Call Deadline: 29-Jul-2013 

Meeting Description:

Web Data as a Challenge for Theoretical Linguistics and Corpus Design
Workshop at the 36th Annual Conference of the German Linguistic Society (March 5-7, 2014 at Marburg University, Marburg/Lahn, Germany)

Organizers:

Felix Bildhauer (Freie Universität Berlin/SFB632)
Roland Schäfer (Freie Universität Berlin)

Program Committee:

Chris Biemann
Stefan Evert
Matthias Hüning
Anke Lüdeling
Alexander Mehler
Uwe Quasthoff
Amir Zeldes
Torsten Zesch
Arne Zeschel

Aim of the Workshop:

The huge amounts of linguistic data on the web offer exciting new possibilities for empirically based theoretical linguistics. Web-derived linguistic resources can contain more variation, as well as more non-standard grammar and writing, than traditionally compiled corpora. Also, whole new registers and genres have been described as emerging on the web. Like spoken language - although clearly distinct from it - the language found on the web can thus challenge linguistic theories that are based mainly on standard written language, as well as the categories assumed within these theories. At the same time, such non-standard features make the data harder to process for computational linguists, and additional care is required when deciding whether to label material as ‘noise’, because some linguists might consider it valuable data.

This workshop aims to bring together researchers working in Theoretical Linguistics and Corpus Linguistics with those who create resources from web data. The primary question of the workshop is: Which new linguistic insights can we derive from web data? Secondarily, we ask how web data is (and how it should be) processed to produce easily accessible high-quality resources and thus facilitate this kind of innovative linguistic research.

Possible subjects for talks include (but are by no means restricted to):

- Theoretically motivated empirical studies of linguistic phenomena in web data
- Work on problems with established linguistic categories specific to certain types of web data (problems with traditional part-of-speech classification, syntactic categories, register and genre classification, etc.)
- Problems of working with web corpora from the user’s perspective in concrete studies (low quality of tokenization, POS tagging, named entity recognition, etc.; availability or lack of metadata)
- Assessments and improvements of the quality of available and newly designed tools and models to process or classify web data
- Approaches to normalization of web data and evaluations of the acceptability of such normalizations from a linguistic perspective
- Sampling of web data (e.g., stratified vs. randomly compiled corpora, linguistic web characterization)

2nd Call for Papers:

We invite submissions for 30-minute talks (20 minutes plus 10 minutes of discussion) on completed or ongoing original research that uses web data or that concerns the creation and/or evaluation of web data resources. The scope of the workshop is restricted neither to resources of a specific size or nature nor to any specific language(s). Submitted abstracts will be reviewed anonymously by at least two reviewers. We hope to offer authors of accepted talks the opportunity to publish an extended version of their talk in a special issue of a peer-reviewed corpus linguistics journal.

Submission Details:

- Submitted abstracts for 30-minute presentations (20 minutes plus 10 minutes of discussion) should be between 800 and 1,000 words long (excluding references and tables).
- Submissions must be anonymous. Please take care to remove any information from the file that could reveal your identity.
- The language of all abstracts and the workshop is English.
- The only accepted file format for submission is PDF.
- Submissions must be made via EasyChair (WEBTL-2014): https://www.easychair.org/conferences/?conf=webtl2014.
- Authors of accepted papers will be asked to provide a shorter 200-word abstract, as an MS Word or OpenDocument file, to be printed in the conference program.






