29.4174, Calls: Comp Ling, Historical Ling, Text/Corpus Ling/Germany

The LINGUIST List linguist at listserv.linguistlist.org
Fri Oct 26 18:48:49 UTC 2018


LINGUIST List: Vol-29-4174. Fri Oct 26 2018. ISSN: 1069 - 4875.

Subject: 29.4174, Calls: Comp Ling, Historical Ling, Text/Corpus Ling/Germany

Moderator: linguist at linguistlist.org (Malgorzata E. Cavar)
Reviews: reviews at linguistlist.org (Helen Aristar-Dry, Robert Coté)
Homepage: https://linguistlist.org

Please support the LL editors and operation with a donation at:
           https://funddrive.linguistlist.org/donate/

Editor for this issue: Everett Green <everett at linguistlist.org>
================================================================


Date: Fri, 26 Oct 2018 14:48:07
From: Eva Zehentner [eva.zehentner at york.ac.uk]
Subject: (Semi-)automatic Retrieval of Data from Historical Corpora

 
Full Title: (Semi-)automatic Retrieval of Data from Historical Corpora 

Date: 21-Aug-2019 - 24-Aug-2019
Location: Leipzig, Germany 
Contact Person: Eva Zehentner
Meeting Email: retrievalSLE2019 at gmail.com

Linguistic Field(s): Computational Linguistics; Historical Linguistics; Text/Corpus Linguistics 

Language Family(ies): Germanic 

Call Deadline: 12-Nov-2018 

Meeting Description:

(Session of 52nd Annual Meeting of the Societas Linguistica Europaea)

Convenors: Marianne Hundt, Melanie Röthlisberger, Gerold Schneider and Eva
Zehentner

Developments in historical corpus linguistics have followed a path similar to
that of corpus-based research on present-day languages: from the creation of
small reference corpora to increasingly large databases, and from text-only
to richly annotated resources. However, historical data have always posed
particular challenges for the development of corpus resources, their
annotation, and their analysis. Corpus representativeness and balancedness,
for instance, have been impaired by the limited availability of texts,
particularly for the very early stages of written attestation. Additionally,
the highly variable orthography typical of earlier texts has meant that the
tools developed for more uniform data cannot be applied in a straightforward
manner to historical corpora. In the case of smaller corpora, this has meant
that grammatical annotation was done manually or through post-editing.
For increasingly large resources, however, manual annotation is tedious,
and researchers have developed pre-processing tools such as spelling
normalisation (Baron and Rayson 2008) and lemmatisation (Burns 2013) to enable
automatic tagging and parsing. Matters are complicated further by the fact
that a range of different annotated resources exist (Penn Treebank, Penn
Parsed Corpora, Universal Dependency Treebanks) and that different parsing
tools (e.g. Schneider 2012) have been applied to historical corpora. These are
likely to require different retrieval strategies, which in turn make
comparisons across corpora difficult. While the list of syntactic parsers is
long (e.g. Schneider 2008 for English, Sennrich et al. 2009 for German,
van Noord 2006 for Dutch, Alberti et al. 2017 for Universal Dependency
parsing), few have been used on, or adapted to, historical texts.

The aim of this workshop is to focus on the challenges that (semi-)automatic
retrieval of data from historical corpora poses for the study of grammatical
change, specifically in English, German, and Dutch. In particular, we invite
contributions on topics such as (but not limited to) the following:

- mapping of different annotation schemes
- evaluation of bottom-up approaches to data retrieval for language change
- issues of precision and recall in historical corpora

Ultimately, this workshop seeks to provide a platform for researchers working
within these subject areas to exchange ideas and to jointly address the
challenges (and opportunities) we face.


Call for Papers:

We invite researchers to submit an anonymised abstract of 300 words (excluding
references) to retrievalSLE2019 at gmail.com by November 12, 2018. Talks will be
20 minutes each, with 5 minutes for discussion and 5 minutes for speaker
change. The workshop will start with an introduction by the organisers, who
will summarise previous research, the research questions addressed in the
workshop and the scope of the papers to be presented. The workshop will be
concluded with a final discussion. 

The workshop proposal to be submitted to the SLE organisers will include all
participants’ abstracts. Notification of acceptance/rejection of the workshop
proposal by the SLE will be given by 15 December 2018. If our workshop
proposal is accepted, we will invite all preliminary workshop participants to
submit their full abstracts by 15 January 2019 to the general call for papers
for review.



