[Corpora-List] 2nd CfP: ACRH3 "Annotation of Corpora for Research in the Humanities"

Tue May 21 09:03:02 UTC 2013

[Apologies for cross-posting]

-----------------------------------------------------
                  Third Workshop on
Annotation of Corpora for Research in the Humanities
                     (ACRH-3)

 -- In memory of Father Roberto Busa (1913-2011) --

  *** Submission deadline: September 15, 2013 ***
------------------------------------------------------ 

The third edition of the Workshop on "Annotation of Corpora for Research 
in the Humanities" (ACRH-3) will be held on December 12, 2013 at the 
University of Sofia (Bulgaria) (http://www.bultreebank.org/ACRH-3/).

Submissions are invited for oral presentations and posters (with or without
demonstrations) featuring high-quality and previously unpublished research
on the topics described below. Contributions should focus on results from
completed as well as ongoing research, with an emphasis on novel approaches,
methods, ideas, and perspectives, whether descriptive, theoretical, formal
or computational.

Proceedings will be published in time for the workshop. The full proceedings
of the previous two editions of ACRH are available at www.jlcl.org (ACRH-1)
and at http://alfclul.clul.ul.pt/crpc/acrh2/ACRH-2_FINAL.pdf (ACRH-2).

The workshop will be co-located with the Twelfth International Workshop on
"Treebanks and Linguistic Theories" (TLT-12), which will be held on
December 13-14, 2013 (http://www.bultreebank.org/TLT12/).

This edition of ACRH will be dedicated to the memory of Father Roberto Busa,
to celebrate the 100th anniversary of his birth (November 28, 1913). ACRH-3
will devote one special session to Father Busa. This session will feature
one introduction and one invited talk, which will be given by the recipient
of the 2013 Busa Award, Prof. Willard McCarty (King's College, London, UK).

MOTIVATION AND AIMS

Research in the Humanities is predominantly text-based. For centuries
scholars have studied documents such as historical manuscripts, literary
works, legal contracts, diaries of important personalities, old tax records
etc.

Manual analysis of such documents is still the dominant research paradigm in
the Humanities. However, with the advent of the digital age this is
increasingly complemented by approaches that utilise digital resources. More
and more corpora are made available in digital form (theatrical plays,
contemporary novels, critical literature, literary reviews etc.). This has a
potentially profound impact on how research is conducted in the Humanities.

Digitised sources can be searched more easily than traditional, paper-based
sources, allowing scholars to analyse texts more quickly and more
systematically. Moreover, digital data can also be (semi-)automatically
mined: important facts, trends and interdependencies can be detected,
complex statistics can be calculated and the results can be visualised and
presented to the scholars, who can then delve further into the data for
verification and deeper analysis.

Digitisation encourages empirical research, opening the road for completely
new research paradigms that exploit 'big data' for humanities research. This
has also given rise to Digital Humanities (or E-Humanities) as a new
research area.

Digitisation is only a first step, however. In their raw form, electronic
corpora are of limited use to humanities researchers. The true potential of
such resources is only unlocked if corpora are enriched with different
layers of linguistic annotation (ranging from morphology to semantics).
While corpus annotation can build on a long tradition in (corpus)
linguistics and computational linguistics, corpus and computational
linguistics on the one side and the Humanities on the other side have grown
apart over the past decades.

The ACRH workshop aims at building a tighter collaboration between people
working in various areas of the Humanities (such as literature, philology,
history etc.) and the research community involved in developing, using and
making accessible annotated corpora. We believe that such a collaboration is
now needed because, although annotating a corpus from scratch remains a
labor-intensive and time-consuming task, it is made considerably easier
today by drawing intensively on prior experience in the field. In practice,
however, such an interplay is still far from being achieved: a gap remains
between computational linguists (who sometimes do not involve humanists in
developing and exploiting annotated corpora for the Humanities) and
humanists (who are sometimes simply unaware that such corpora exist and that
automatic methods and standards for building them are available today).

Although many corpora that play a relevant role for research in the
Humanities are available today in digital format, only a few of them are
linguistically tagged; most still lack any linguistic annotation. Over the
past few years a number of historical annotated corpora have been started,
among which are treebanks for Middle, Early Modern and Old English, Early
New High German, Medieval Portuguese, Ugaritic, Latin, Ancient Greek and
several translations of the New Testament into Indo-European languages. The
experience of this ever-growing set of projects can provide many suggestions
on the methodology as well as on the practice of interaction between
literary studies, philology and corpus linguistics.

TOPICS

To overcome the above-mentioned issues, ACRH-3 aims at covering a wide range
of topics related to the annotation of corpora for research in the
Humanities.

The topics to be addressed in the workshop include (but are not limited to)
the following:

- specific issues related to the annotation of corpora for research in the
Humanities

- annotated corpora as a basis for research in the Humanities

- diachronic, historical and literary annotated corpora

- use of annotated corpora for stylometrics and authorship attribution

- philological issues, like different readings, textual variants, apparatus,
non-standard orthography and spelling variation

- annotation principles and schemes of corpora for research in the
Humanities

- adaptation of NLP tools for older language varieties

- integration of annotated corpora for the Humanities into language
resources infrastructures

- tools for building and accessing annotated corpora for the Humanities

- examples of fruitful collaboration between Computational Linguistics and
Humanities in building and exploiting annotated corpora

INVITED SPEAKER: Willard McCarty (King's College, London, UK)

IMPORTANT DATES

Deadlines: always midnight, UTC ('Coordinated Universal Time'), ignoring DST
('Daylight Saving Time'):

- Deadline for paper submission:     September 15, 2013

- Notification of acceptance:        October 18, 2013

- Final version of paper:            November 17, 2013

- Workshop:                          December 12, 2013

INSTRUCTIONS FOR SUBMISSION

We invite submissions of full papers describing original, unpublished
research related to the topics of the workshop. Papers should not exceed 12
pages.

The language of the workshop is English. All papers must be submitted in
well-checked English.

Papers should be submitted in PDF format only. Submissions have to be made
via the EasyChair page of the workshop at
https://www.easychair.org/conferences/?conf=acrh3. Please register at
EasyChair first if you do not have an EasyChair account.

The style guidelines follow the specifications required by TLT. They can be
found here: http://www.bultreebank.org/ACRH-3/StyleGuidelines.html.

Please note that as reviewing will be double-blind, the papers should not
include the authors' names and affiliations or any references to websites,
project names etc. revealing the authors' identity. Furthermore, any
self-reference should be avoided. For instance, instead of "We previously
showed (Brown, 2001)...", use citations such as "Brown previously showed
(Brown, 2001)...". Each submitted paper will be reviewed by three members of
the program committee.

Submitted papers can be for oral or poster presentations (with or without
demo). The kinds of presentation do not differ in either the reviewing
process or publication in the proceedings (the limit of 12 pages holds for
both oral and poster presentations).

ORAL PRESENTATION

The oral presentations at the workshop will be 30 minutes long (25 minutes
for presentation and 5 minutes for questions and discussion).

PROGRAM COMMITTEE CHAIRS

- Francesco Mambrini (Deutsches Archäologisches Institut, Berlin, Germany)

- Marco Passarotti (Università Cattolica del Sacro Cuore, Milan, Italy)

- Caroline Sporleder (University of Trier, Germany)

PROGRAM COMMITTEE MEMBERS

- Stefanie Dipper (Germany)

- Voula Giouli (Greece)

- Iris Hendrickx (Portugal)

- Erhard Hinrichs (Germany)

- Cerstin Mahlow (Switzerland)

- Alexander Mehler (Germany)

- Jiří Mírovský (Czech Republic)

- Christian-Emil Smith Ore (Norway)

- Michael Piotrowski (Germany)

- Paul Rayson (UK)

- Martin Reynaert (The Netherlands)

- Jeff Rydberg Cox (USA)

- Kiril Simov (Bulgaria)

- Stefan Sinclair (Canada)

- Mark Steedman (UK)

- Frank Van Eynde (Belgium)

- Martin Wynne (UK)

LOCAL ORGANIZATION

- Petya Osenova (University of Sofia, Bulgaria)

- Kiril Simov (IICT-BAS)

- Stanislava Kancheva (University of Sofia, Bulgaria)

- Georgi Georgiev (Ontotext)

- Borislav Popov (Ontotext)

-- 
---------------------------------------------------------
Caroline Sporleder
Computational Linguistics
& Digital Humanities
Trier University
---------------------------------------------------------