[Corpora-List] 2nd CfP: ACRH3 "Annotation of Corpora for Research in the Humanities"
Caroline Sporleder
csporled at coli.uni-sb.de
Tue May 21 09:03:02 UTC 2013
[Apologies for cross-posting]
-----------------------------------------------------
Third Workshop on
Annotation of Corpora for Research in the Humanities
(ACRH-3)
-- In memory of Father Roberto Busa (1913-2011) --
*** Submission deadline: September 15, 2013 ***
------------------------------------------------------
The third edition of the Workshop on "Annotation of Corpora for Research
in the Humanities" (ACRH-3) will be held on December 12, 2013 at the
University of Sofia (Bulgaria) (http://www.bultreebank.org/ACRH-3/).
Submissions are invited for oral presentations and posters (with or without
demonstrations) featuring high quality and previously unpublished research on
the topics described below. Contributions should focus on results from
completed as well as ongoing research, with an emphasis on novel approaches,
methods, ideas, and perspectives, whether descriptive, theoretical, formal or
computational.
Proceedings will be published in time for the workshop. The full proceedings
of the previous two editions of ACRH are available at www.jlcl.org (ACRH-1)
and at http://alfclul.clul.ul.pt/crpc/acrh2/ACRH-2_FINAL.pdf (ACRH-2),
respectively.
The workshop will be co-located with the Twelfth International Workshop on
"Treebanks and Linguistic Theories" (TLT-12), which will be held on
December 13-14, 2013 (http://www.bultreebank.org/TLT12/).
This edition of ACRH will be dedicated to the memory of Father Roberto Busa,
to celebrate the 100th anniversary of his birth (November 28, 1913). ACRH-3
will devote one special session to Father Busa. This session will feature one
introduction and one invited talk, which will be given by the recipient of the
2013 Busa Award, Prof. Willard McCarty (King's College, London, UK).
MOTIVATION AND AIMS
Research in the Humanities is predominantly text-based. For centuries scholars
have studied documents such as historical manuscripts, literary works, legal
contracts, diaries of important personalities, old tax records etc.
Manual analysis of such documents is still the dominant research paradigm in
the Humanities. However, with the advent of the digital age this is
increasingly complemented by approaches that utilise digital resources. More
and more corpora are made available in digital form (theatrical plays,
contemporary novels, critical literature, literary reviews etc.). This has a
potentially profound impact on how research is conducted in the Humanities.
Digitised sources can be searched more easily than traditional, paper-based
sources, allowing scholars to analyse texts more quickly and more
systematically. Moreover, digital data can also be (semi-)automatically mined:
important facts, trends and interdependencies can be detected, complex
statistics can be calculated and the results can be visualised and presented
to the scholars, who can then delve further into the data for verification and
deeper analysis.
Digitisation encourages empirical research, opening the road for completely
new research paradigms that exploit 'big data' for humanities research. This
has also given rise to Digital Humanities (or E-Humanities) as a new research
area.
Digitisation is only a first step, however. In their raw form, electronic
corpora are of limited use to humanities researchers. The true potential of
such resources is only unlocked if corpora are enriched with different layers
of linguistic annotation (ranging from morphology to semantics). While corpus
annotation can build on a long tradition in (corpus) linguistics and
computational linguistics, corpus and computational linguistics on the one
side and the Humanities on the other side have grown apart over the past
decades.
The ACRH workshop aims at building a tighter collaboration between people
working in various areas of the Humanities (such as literature, philology,
history etc.) and the research community involved in developing, using and
making accessible annotated corpora. We believe that such a collaboration is
now needed because, while annotating a corpus from scratch remains a
labor-intensive and time-consuming task, it can today be simplified by
intensively exploiting prior experience in the field. In practice, such an
interplay is still far from being achieved: a gap remains between
computational linguists (who sometimes do not involve humanists in developing
and exploiting annotated corpora for the Humanities) and humanists (who are
sometimes simply unaware that such corpora exist and that automatic methods
and standards for building them are available today).
Although many corpora that play a relevant role for research in the Humanities
are available today in digital format, only a few of them are linguistically
tagged; most still lack linguistic tagging altogether. Over the past few years
a number of historical annotated corpora have been started, among which are
treebanks for Middle, Early Modern and Old English, Early New High German,
Medieval Portuguese, Ugaritic, Latin, Ancient Greek and several translations
of the New Testament into Indo-European languages. The experience of this
ever-growing set of projects can provide many suggestions on the methodology
as well as on the practice of interaction between literary studies, philology
and corpus linguistics.
TOPICS
To overcome the above-mentioned issues, ACRH-3 aims at covering a wide range
of topics related to the annotation of corpora for research in the Humanities.
The topics to be addressed in the workshop include (but are not limited to)
the following:
- specific issues related to the annotation of corpora for research in the
  Humanities
- annotated corpora as a basis for research in the Humanities
- diachronic, historical and literary annotated corpora
- use of annotated corpora for stylometrics and authorship attribution
- philological issues, like different readings, textual variants, apparatus,
  non-standard orthography and spelling variation
- annotation principles and schemes of corpora for research in the Humanities
- adaptation of NLP tools for older language varieties
- integration of annotated corpora for the Humanities into language resources
  infrastructures
- tools for building and accessing annotated corpora for the Humanities
- examples of fruitful collaboration between Computational Linguistics and
  Humanities in building and exploiting annotated corpora
INVITED SPEAKER: Willard McCarty (King's College, London, UK)
IMPORTANT DATES
Deadlines: always midnight, UTC ('Coordinated Universal Time'), ignoring DST
('Daylight Saving Time'):
- Deadline for paper submission: September 15, 2013
- Notification of acceptance: October 18, 2013
- Final version of paper: November 17, 2013
- Workshop: December 12, 2013
INSTRUCTIONS FOR SUBMISSION
We invite submissions of full papers describing original, unpublished research
related to the topics of the workshop. Papers should not exceed 12 pages.
The language of the workshop is English. All papers must be submitted in
well-checked English.
Papers should be submitted in PDF format only. Submissions have to be made via
the EasyChair page of the workshop at
https://www.easychair.org/conferences/?conf=acrh3. Please register at
EasyChair first if you do not have an EasyChair account.
The style guidelines follow the specifications required by TLT. They can be
found here: http://www.bultreebank.org/ACRH-3/StyleGuidelines.html.
Please note that, as reviewing will be double-blind, papers should not include
the authors' names and affiliations or any references to websites, project
names etc. that reveal the authors' identity. Furthermore, any self-reference
should be avoided. For instance, instead of "We previously showed (Brown,
2001)...", use citations such as "Brown previously showed (Brown, 2001)...".
Each submitted paper will be reviewed by three members of the program
committee.
Submitted papers can be for oral or poster presentations (with or without
demo). There is no difference between the kinds of presentation in terms of
either the reviewing process or publication in the proceedings (the limit of
12 pages holds for both oral and poster presentations).
ORAL PRESENTATION
The oral presentations at the workshop will be 30 minutes long (25 minutes for
presentation and 5 minutes for questions and discussion).
PROGRAM COMMITTEE CHAIRS
- Francesco Mambrini (Deutsches Archäologisches Institut, Berlin, Germany)
- Marco Passarotti (Università Cattolica del Sacro Cuore, Milan, Italy)
- Caroline Sporleder (University of Trier, Germany)
PROGRAM COMMITTEE MEMBERS
- Stefanie Dipper (Germany)
- Voula Giouli (Greece)
- Iris Hendrickx (Portugal)
- Erhard Hinrichs (Germany)
- Cerstin Mahlow (Switzerland)
- Alexander Mehler (Germany)
- Jiří Mírovský (Czech Republic)
- Christian-Emil Smith Ore (Norway)
- Michael Piotrowski (Germany)
- Paul Rayson (UK)
- Martin Reynaert (The Netherlands)
- Jeff Rydberg Cox (USA)
- Kiril Simov (Bulgaria)
- Stefan Sinclair (Canada)
- Mark Steedman (UK)
- Frank Van Eynde (Belgium)
- Martin Wynne (UK)
LOCAL ORGANIZATION
- Petya Osenova (University of Sofia, Bulgaria)
- Kiril Simov (IICT-BAS)
- Stanislava Kancheva (University of Sofia, Bulgaria)
- Georgi Georgiev (Ontotext)
- Borislav Popov (Ontotext)
--
---------------------------------------------------------
Caroline Sporleder
Computational Linguistics
& Digital Humanities
Trier University
---------------------------------------------------------