[Lingtyp] UD annotation issues - Historical treebanks

Silvia Luraghi luraghi at unipv.it
Thu Jan 26 11:39:32 UTC 2023

Hello everyone, we are Silvia Luraghi, Chiara Zanchi, Erica Biagetti and
Luca Brigada Villa (University of Pavia) and we are starting *Universal
Dependencies for Historical Languages* (*UD4HL*), a new *discussion group* on
the annotation of treebanks of historical languages.

We would like to systematically address with colleagues working on ancient
languages problems such as:

1. *methodological issues*, linked for example to the possibility of
including philological or historical information in the MISC field;

2. *annotation issues*, mainly due to the fact that for historical
languages we cannot rely on the competence of speakers and we often need
more fine-grained guidelines than those of modern languages.

Our intention is not to use this group as a substitute for the already
existing and lively UD discussion groups, but rather to create an
environment where issues that are specific to the annotation of ancient
languages can be discussed among experts on such languages before they are
presented to the UD community. For each topic addressed within our group,
we plan to open a new GitHub issue in the UD repository, so that other
people can take part in the discussion and the solutions we reach for
historical languages can be shared with the rest of the community.

We would like to start a *review process* addressing *one construction type
at a time*. In this way we would be able to fix errors due to conversion
from other annotation schemes or to inconsistent application of the
guidelines, and at the same time assemble more fine-grained guidelines for
new treebank developers. In our view, correcting the annotation of ancient
language treebanks does not mean changing UD rules, but rather finding the
most linguistically accurate solutions within those rules. For example,
Chiara Zanchi, Erica Biagetti
and Francesco Mambrini (2022) focused on how to apply the UD guidelines on
trivalent verbs to double accusative constructions in the *Homeric
Dependency Treebank*. In this case, more fine-grained guidelines served to
indicate to the annotators of Ancient Greek (and eventually Sanskrit,
Latin, etc.) treebanks how to identify the second and third argument in
such constructions (mainly given that there are no native speakers of these
languages and we therefore cannot rely on intuition).

Besides discussing them with the UD community, we would like to present
some of the outcomes of the discussion at the workshop *Exploiting
standardized cross-linguistic data in historical linguistics* organized by
Robert Forkel, Gerhard Jäger and Johann-Mattis List at the *International
Conference on Historical Linguistics* (ICHL26, 4-8 September 2023). You can
find more details on the workshop at the following link:

We would be really happy if you would like to take part in the discussion!
If so, we kindly ask you to fill out the form below and forward it to your

*Link to the Google form*: https://forms.gle/iwoU5WWYuZH1JXQHA

Silvia Luraghi
Università di Pavia
Dipartimento di Studi Umanistici, Sezione di Linguistica
Strada Nuova 65
I-27100 Pavia
tel.: +39/0382/984685
Personal web page: https://studiumanistici.unipv.it/?pagina=docenti&id=68