[Corpora-List] DEADLINE EXTENSION: LRT4HDA - LANGUAGE RESOURCES AND TECHNOLOGIES FOR PROCESSING AND LINKING HISTORICAL DOCUMENTS AND ARCHIVES

Thu Feb 6 20:59:20 UTC 2014

With apologies for cross-posting

********************************************************************************
DEADLINE EXTENSION 26 FEBRUARY 2014

FINAL CALL FOR PAPERS
LANGUAGE RESOURCES AND TECHNOLOGIES FOR PROCESSING AND LINKING  
HISTORICAL DOCUMENTS AND ARCHIVES - LRT4HDA - Deploying Linked Open Data
in Cultural Heritage
Full-day Workshop organised in conjunction with the LREC 2014  
Conference (http://lrec2014.lrec-conf.org/en/)
26 May 2014 Reykjavík, Iceland
http://www.c-phil.uni-hamburg.de/view/Main/LTforHisLangArhives2014
- SPECIAL TRACK on Digital Acquisition and Analysis of Historical  
Newspaper Collections -
********************************************************************************

MOTIVATION:
Recently, collaboration between the NLP community and specialists in  
various areas of the Humanities has become more efficient and fruitful  
due to the common aim of exploring and preserving cultural heritage  
data. It is worth mentioning the efforts made during the digitisation  
campaigns of recent years and within a series of initiatives in the
Digital Humanities, especially in making old manuscripts available  
through Digital Libraries.

Given the number of contemporary languages and their historical  
variants, it is practically impossible to develop brand new language  
resources and tools for processing older texts. Therefore, the real  
challenge is to adapt existing language resources and tools, as well  
as to provide (where necessary) training material in the form of  
corpora or lexicons for a certain period of time in history.

Another issue regarding historical documents is their usage after they  
are stored in digital libraries. Historical documents are not only
browsed; together with adequate tools, they may serve as a basis for
the re-interpretation of historical facts and the discovery of new
connections and causal relations between events. To make such
analyses possible, historical documents should be linked among themselves, on
the one hand, and with modern knowledge bases, on the other.  
Activities in the area of Linked Open Data (LOD) play a major role in  
this respect.

Newspaper collections and archives are a particular type of historical
document. Newspapers reflect what is going on in
society, and constitute a rich data collection for many types of  
humanities research, ranging from history, political and social  
sciences to linguistics, both synchronic and diachronic, and both  
national and cross-national. They represent an important resource for  
the analysis of changes at all levels that emerged in Europe with the
beginning of the industrialization period.

The aim of this workshop is to bring together researchers working in  
the interdisciplinary domain of cultural heritage, specialists in  
natural language and speech processing working with less-resourced  
languages as well as key players among Linked Open Data initiatives.  
They are expected to analyse problems and brainstorm solutions in the  
automatic analysis of historical documents, uni- or multimedia, their  
deep annotation and interlinking.

The workshop is organised in collaboration with CLARIN (http://www.clarin.eu).

TOPICS OF INTEREST:
We are looking for contributions on original, unpublished work in the  
topic areas of the workshop, including (but not limited to) the  
following:
- Language tools and resources for the analysis of older textual material;
- Adaptation of language technology tools developed for modern  
languages to their historical variants; transcription and  
transliteration problems and solutions;
- Named Entity Recognition for historical texts;
- Development of dedicated historical corpora and lexica as Linked Open Data;
- (Semi-) automatic extraction of content related metadata;
- Semantic linkage of heterogeneous data within digital libraries;
- Linkage of historical documents with available Linked Open Data;
- Word sense disambiguation in old texts;
- Multilingual issues in historical texts;
- Applications concerning less-resourced cultural heritage languages
such as Old Norse, early Arabic, Ottoman Turkish, Old Church
Slavonic, older forms of Balkan languages.

A special track will be dedicated to the acquisition and analysis of
historical newspaper archives. Submissions for this special track
should address topics related to the following aspects:
- Determining whether, for a given language and period, digital (or
digitised) newspapers exist at all
- Access rights to digital newspaper collections for research purposes
and for publishing the results
- OCR solutions and limitations
- Extraction of articles from digitised archives
- Metadata annotation
- Showcases of successful humanities research projects based on  
digital or digitised newspapers
- Publishing, sharing and storing results and by-products

SUBMISSION DETAILS
Submissions should be made through the system of the main LREC  
conference. Papers describing completed work should be no longer than  
eight pages. Papers describing work in progress should be between four  
and six pages. We encourage in particular the demonstration of  
prototype systems, and papers including reference to an existing  
prototype will be offered the possibility to demonstrate their system  
in a particular session.

Papers should respect the LREC formatting guidelines. Papers will be  
reviewed by at least three members of the Programme Committee.

When submitting a paper from the START page, authors will be asked to  
provide essential information about resources (in a broad sense, i.e.  
also technologies, standards, evaluation kits, etc.) that have been  
used for the work described in the paper or are a new result of their
research. Moreover, ELRA encourages all LREC authors to share the
described LRs (data, tools, services, etc.) to enable their reuse and
the replicability of experiments, including evaluations.

Submissions for the workshop can be made using the following link:

http://www.softconf.com/lrec2014/LRT4HDA/

Papers dealing with the processing of newspaper archives should be
submitted to the "Newspaper" track, all others to the "main track".

IMPORTANT DATES

Submission deadline (extended): 26 February 2014
Notification of acceptance: 19 March 2014
Final papers due: 30 March 2014

ORGANIZING COMMITTEE

Kristín Bjarnadóttir (The Arni Magnusson Institute for Icelandic Studies, Iceland)
Matthew Driscoll (Arnamagnean Institute, Copenhagen, Denmark)
Steven Krauwer (CLARIN ERIC, Netherlands)
Stelios Piperidis (ILSP, Athens, Greece)
Cristina Vertan (University of Hamburg, Germany)
Martin Wynne (Oxford, UK)

PROGRAMME COMMITTEE

Lars Borin (University of Gothenburg, Sweden)
Rafael Carrasco (University of Alicante, Spain)
Paul Doorenbosch (National Library of the Netherlands, Netherlands)
Thorhallur Eythorsson (University of Iceland)
Alexander Geyken (BBAW, Germany)
Günther Görz (University Erlangen, Germany)
Walther v. Hahn (University of Hamburg, Germany)
Erhard Hinrichs (University of Tuebingen, Germany)
Guillaume Jacquet (JRC, Italy)
Marc Kupietz (IDS, Germany)
Éric Laporte (Université Paris-Est Marne-la-Vallée, France)
Piroska Lendvai (Hungarian Academy of Sciences, Hungary)
Thierry Paquet (LITIS, France)
Gábor Prószéky (MorphoLogic, Hungary)
Bente Maegaard (University of Copenhagen, Denmark)
Christian Emil Ore (University of Oslo, Norway)
Eiríkur Rögnvaldsson (University of Iceland)
Petya Osenova (IICT, Bulgarian Academy of Sciences, Bulgaria)
Manfred Thaller (Cologne University, Germany)
Tamás Váradi (Hungarian Academy of Sciences, Hungary)
Matthew Whelpton (University of Iceland)
Kalliopi Zervanou (University of Tilburg, the Netherlands)

CONTACT:
Cristina Vertan (University of Hamburg)
cristina DOT vertan AT uni-hamburg.de
-- 
============================================================
Dr. Cristina Vertan
Arbeitsstelle "Computerphilologie"
und
AB. Natürlichsprachliche Systeme (NATS)
Vogt-Kölln-Strasse 30
22527 Hamburg

Raum/room F-534b
tel: +49 40 42883 2319
fax: +49 40 42883 2385
http://nats-www.informatik.uni-hamburg.de/CristinaVertan

============================================================
