[Corpora-List] Call for Abstracts: Multilinguality in historical documents - challenges and solutions for digital humanities (MHist) - Workshop at DH2014
Cristina Vertan
cristina.vertan at uni-hamburg.de
Tue Apr 29 15:12:56 UTC 2014
With apologies for cross-posting
**********************************************************************************
CALL FOR PAPERS
Multilinguality in historical documents - challenges and solutions for
digital humanities (MHist)
Full-day Workshop organised in conjunction with the Digital Humanities
2014 Conference (http://dh2014.org)
7 July 2014, Lausanne, Switzerland
http://www.linguistics.ruhr-uni-bochum.de/MHist/
Workshop endorsed by the ACL-SIGHUM Special Interest Group on Language
Technologies for the Socio-Economic Sciences and Humanities
(http://sighum.science.ru.nl/)
********************************************************************************
MOTIVATION
Recently, the collaboration between the Language Technology community
and specialists in various areas of the Humanities has become more
efficient and fruitful, driven by the common aim of exploring and
preserving cultural heritage data. Worth mentioning are the efforts
made during the digitisation campaigns of recent years and within a
series of initiatives in the Digital Humanities, especially in making
old manuscripts and prints available in the form of Digital Libraries.
The availability of old texts on-line has produced a revolutionary
shift in the way such objects are analysed. They are no longer
restricted to a small number of specialists who know the language of
the document, but are open to broader groups with various requirements:
1. non-expert users who would like to know what the document is about,
understand the main topics, and locate places and persons. These users
have little or no knowledge of old languages and are usually less
familiar with toponyms (especially when these belong to geographical
areas unknown to the user);
2. researchers from neighbouring fields, who often have only minimal
knowledge of the language but considerable knowledge of the historical
context and might be familiar with historical toponyms and proper
names;
3. students and researchers specialising in historical data, who have
the required language skills but can still profit from additional
information accompanying the texts.
These considerations imply that the storage and visualisation of old
texts should be accompanied by a collection of tools that enrich the
text with suitable information and make it understandable for
different user groups. Such tools usually involve automatic language
processing methods. In contrast to the processing of modern texts, for
which language technology has made great progress in recent years,
automatic processing of old texts is still problematic, mainly because:
- Historical language data is sparse. First, compared to the
wealth of documents written in modern languages, there are only a few
documents available for historical languages. Second, transcribing old
manuscripts often requires expert knowledge. Third, due to the absence
of a standard language, historical language variants differ from each
other in spelling, morphology, syntax, and lexical semantics.
- Texts are often multilingual, consisting of mixtures of different
languages, such as single words, phrases, or entire sentences written
in Latin that are intermixed with passages written in the actual
language of the text. In the case of texts from areas with rich
cultural mixtures (e.g. the Balkans), one can additionally find
paragraphs in "exotic" local languages.
The focus of this workshop is on the second aspect. We think that the
challenges posed by multilinguality should be tackled by adapting
existing multilingual language resources and tools, and, where
necessary, by providing training data in the form of corpora or
lexicons for a certain period of time in history.
We are looking for original unpublished work on topics including, but
not limited to, the following:
- character-level MT for normalisation
- historical and modern data as comparable corpora
- historical texts in different languages as parallel or comparable corpora
- MT for translation between language versions
- OCR for multilingual documents
- word- and/or paragraph-level language identification
- crosslingual retrieval in historical documents
- ontologies as language-independent interfaces between collections of
historical texts
- particularities of multilingual historical texts and challenges for IT
- information extraction and retrieval for multilingual historical documents
Authors interested in submitting a paper are required to send an email
to cristina.vertan at uni-hamburg.de containing the title, the authors,
and a 10-line abstract no later than 15th May. Submissions, in the form
of an abstract of about 1500 words, are due by 10th June at the same
address. Notifications of acceptance/rejection will be issued around
25th June.
IMPORTANT DATES
Intention email: 15th May 2014
Submission of abstracts: 10th June 2014
Notification of acceptance/rejection: 25th June 2014
Workshop: 7th July 2014
PROGRAMME COMMITTEE
- Lars Borin (University of Gothenburg, Sweden)
- Rafael Carrasco (University of Alicante, Spain)
- Paul Doorenbosch (National Library of the Netherlands, Netherlands)
- Thorhallur Eythorsson (University of Iceland)
- Alexander Geyken (BBAW, Germany)
- Günther Görz (University Erlangen, Germany)
- Walther v. Hahn (University of Hamburg, Germany)
- Erhard Hinrichs (University of Tuebingen, Germany)
- Guillaume Jacquet (JRC, Italy)
- Marc Kupietz (IDS, Germany)
- Éric Laporte (Université Paris-Est Marne-la-Vallée, France)
- Piroska Lendvai (Hungarian Academy of Sciences, Hungary)
- Thierry Paquet (LITIS, France)
- Gábor Prószéky (MorphoLogic, Hungary)
- Bente Maegaard (University of Copenhagen, Denmark)
- Christian Emil Ore (University of Oslo, Norway)
- Eiríkur Rögnvaldsson (University of Iceland)
- Petya Osenova (IICT, Bulgarian Academy of Sciences, Bulgaria)
- Manfred Thaller (Cologne University, Germany)
- Tamás Váradi (Hungarian Academy of Sciences, Hungary)
- Matthew Whelpton (University of Iceland)
- Kalliopi Zervanou (University of Tilburg, the Netherlands)
ORGANISING COMMITTEE
Cristina Vertan, University of Hamburg, Germany
Stefanie Dipper, Ruhr-University Bochum, Germany
Noah Bubenhofer, TU Dresden, Germany / UZH Zurich, Switzerland
Laurent Romary, INRIA, France / Humboldt-University Berlin, Germany
--
============================================================
Dr. Cristina Vertan
Arbeitsstelle "Computerphilologie"
und
AB. Natürlichsprachliche Systeme (NATS)
Vogt-Kölln-Strasse 30
22527 Hamburg
Raum/room F-534b
tel: +49 40 42883 2319
fax: +49 40 42883 2385
http://nats-www.informatik.uni-hamburg.de/CristinaVertan
============================================================