[Corpora-List] grammatical annotation of Ancient Greek and Latin texts: the summary

Sat Dec 21 23:26:24 UTC 2002

Here's the promised summary. The answers have been too few to give a
general view of the state of the grammatical annotation of the
ancient Greek and Latin corpus of texts, but I hope this very brief,
incomplete and imperfect account can be of some use to somebody
(mostly classicists; there's not a lot here for computational
linguists). I have interspersed many comments of my own. I am still
interested in receiving new information about this subject.

Many thanks to Eckhard Bick, Rodney Decker, Tim Finney, Dennis
Hukel, Kathleen McNamee, Tito Orlandi and especially to Anne Mahoney,
who sent me a very useful summary of the notation schemes used for
the morphological annotation of the Perseus texts.

	Well, as far as I have been able to gather, and putting
aside the Bible and the corpus of the earliest Greek Christian texts,
it seems to me that there is not yet a lot going on in the field of
the grammatical annotation of the ancient Greek and Latin corpus
outside the aegis of the Perseus project. Most of the work already
done, and most of the existing projects, concern only the
morphological annotation of the texts, be it with automatic taggers
(output not disambiguated) or with heavy intervention of human
operators (output disambiguated), the latter being the preferred
approach for Biblical texts.
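
To make the difference between the two kinds of output concrete, here
is a minimal Python sketch. It is not taken from any of the projects
mentioned in this summary: the names Token, candidates, chosen and
disambiguate, and the tag strings, are all hypothetical. It only
shows a word as an automatic tagger might leave it (several candidate
analyses) and the same word after a human operator has picked one.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Token:
    """One word of a text with its possible morphological analyses.

    Hypothetical structure, for illustration only.
    """
    form: str
    candidates: List[str] = field(default_factory=list)
    chosen: Optional[int] = None  # index into candidates, set by a human

    def is_disambiguated(self) -> bool:
        # A token is unambiguous if a choice was made or only one
        # analysis was ever possible.
        return self.chosen is not None or len(self.candidates) == 1

    def analysis(self) -> Optional[str]:
        """Return the single accepted analysis, or None if still ambiguous."""
        if self.chosen is not None:
            return self.candidates[self.chosen]
        if len(self.candidates) == 1:
            return self.candidates[0]
        return None


def disambiguate(token: Token, index: int) -> Token:
    """Return a copy of the token with one candidate analysis selected."""
    if not 0 <= index < len(token.candidates):
        raise IndexError("no such candidate analysis")
    return Token(token.form, list(token.candidates), index)


# Latin "rosae" can be genitive singular, dative singular or
# nominative plural; the tag strings below are made up for the example.
raw = Token("rosae", ["noun gen sg", "noun dat sg", "noun nom pl"])
checked = disambiguate(raw, 2)
print(raw.is_disambiguated(), checked.analysis())
```

Keeping the rejected candidates next to the chosen one, as this
sketch does, is only one possible design; it lets a later reader
review the operator's decision.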

	On the Greek side we have the well-known collection of
morphologically tagged literary texts and subliterary and
documentary papyri and ostraka from the Perseus project and the Duke
Databank, accessible through the Web (<http://www.perseus.tufts.edu/>,
<http://www.perseus.tufts.edu/cache/perscoll_DDBDP.html>). Few
projects in the humanities can compare to the Perseus project in
scope, progress, the mass of collected and processed data, and public
success. Latinists, in addition to the Latin pages of the Perseus
site, owe a debt to the Belgian classicists on account of the
remarkable CETEDOC / CTLO CD-ROMs (see
<http://www.brepols.net/publishers/cd-rom.htm#CLCLT>) and the
LASLA/CIPL databases (<http://www.ulg.ac.be/cipl/lsl.htm>). See
<http://bcs.fltr.ucl.ac.be/DicLanD.html> for information on both
projects.

	Bible students and researchers of the New Testament texts and
related literature (in Greek, Latin and Hebrew) have unparalleled
electronic tools at their disposal with BibleWindows
(<http://www.silvermnt.com/bwinfo.htm>) and especially Accordance
(<http://www.oaksoft.com/>). See the reviews at
<http://www.swcp.com/~kfapa/Bible/index.html>. Yet there is not much
on syntax there. The electronic version of the Nestle-Aland critical
edition of the Greek New Testament looks promising (see
<http://www.uni-tuebingen.de/cgi-bin/abs/abs?propid=54>).

	There are a few sites devoted to the grammatical analysis of
the Greek-Latin corpus, such as <http://visl.sdu.dk>, which offers a
small treebank of analysed Greek and Latin sentences with a tree
visualizer.

	The OpenText project (<http://www.opentext.org>) offers a
very interesting proposal for another kind of text annotation for the
edition of ancient texts (mainly intended for papyri).

	For the serious exploitation of syntactically analysed
corpora specifically designed for Greek texts (my main concern) there
is much less to see. Any researcher who wants to publicise his work
might be interested in the TIGER facilities for the public
exploitation of existing banks of syntax graphs. Don't miss the TIGER
Project page
<http://www.ims.uni-stuttgart.de/projekte/TIGER/annotation/lfg/parsing/>
and see what's there!

	To the best of my knowledge, there is no project to develop a
parser for the automatic analysis of Ancient Greek texts, and that's
a pity. However tentative such analyses still are today (and will
remain for the foreseeable future), the creation of such tools would
help syntacticians to improve the existing grammars and would provide
the general scholar with annotated texts (after heavy human
intervention).

	Some years ago I wrote a paper on the syntactic annotation
of the corpus of ancient Greek texts [[1]]. In this paper, after a
short introduction to the concept of parsers and grammatical editors,
I expressed my personal conviction that: a) both kinds of computer
tools should be developed fast for the good of Greek and Latin
studies; b) projects should start sharing the results obtained with
such tools, i.e. the annotated corpora; c) considering the needs of
classicists today, grammatical editors would probably be their
preferred choice. [A grammatical editor is a program that offers
i) an interface allowing a human operator to grammatically parse a
text on a computer, facilitating some of the tasks involved; ii) the
tools to store, search, retrieve, and compile statistics about the
text corpora thus parsed; iii) an interface to present the results of
any of the above-mentioned operations to the end user.]
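
The three parts of that definition can be illustrated with a short
Python sketch. This is not the design of Aristarchus or of any other
program mentioned here, only a toy model under my own assumptions:
the class and method names (Word, AnnotatedCorpus, annotate, search,
statistics, display) and the labels SUBJ, OBJ and PRED are
hypothetical. annotate stands for part (i), search and statistics for
part (ii), and display for part (iii).

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Word:
    form: str
    function: Optional[str] = None  # syntactic label given by the operator
    head: Optional[int] = None      # index of the governing word, None = root


class AnnotatedCorpus:
    """Toy store for manually parsed sentences (illustrative only)."""

    def __init__(self) -> None:
        self.sentences: Dict[str, List[Word]] = {}

    def add_sentence(self, ref: str, forms: List[str]) -> None:
        self.sentences[ref] = [Word(f) for f in forms]

    # (i) the operator records an analysis for one word
    def annotate(self, ref: str, index: int, function: str,
                 head: Optional[int] = None) -> None:
        words = self.sentences[ref]
        if head is not None and not 0 <= head < len(words):
            raise IndexError("head must point to a word of the sentence")
        words[index].function = function
        words[index].head = head

    # (ii) search and statistics over what has been parsed so far
    def search(self, function: str) -> List[Tuple[str, str]]:
        return [(ref, w.form)
                for ref, words in self.sentences.items()
                for w in words if w.function == function]

    def statistics(self) -> Counter:
        return Counter(w.function
                       for words in self.sentences.values()
                       for w in words if w.function is not None)

    # (iii) present one sentence to the end user as a simple table
    def display(self, ref: str) -> str:
        words = self.sentences[ref]
        rows = []
        for i, w in enumerate(words):
            head = words[w.head].form if w.head is not None else "-"
            rows.append(f"{i}\t{w.form}\t{w.function or '?'}\t{head}")
        return "\n".join(rows)


corpus = AnnotatedCorpus()
corpus.add_sentence("ex.1", ["puella", "rosam", "amat"])
corpus.annotate("ex.1", 2, "PRED")
corpus.annotate("ex.1", 0, "SUBJ", head=2)
corpus.annotate("ex.1", 1, "OBJ", head=2)
print(corpus.display("ex.1"))
print(corpus.search("OBJ"), dict(corpus.statistics()))
```

A real editor would of course need persistent storage and a much
richer annotation scheme; the sketch only shows how the three tasks
can share one data structure.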

	Conclusion (c), earnest as it was, was not entirely
disinterested, however: in the same paper I presented my own
grammatical editor, called Aristarchus. If anybody is interested, I
can send a PDF version of the paper.

	I owe to Kathleen McNamee a piece of information that is
tangential to this point but nonetheless very interesting: the
release (due in 2003) of the "Commentaria et Lexica Graeca in Papyris
reperta", a major edition of ancient grammatical commentaries edited
by G. Bastianini, H. Maehler, M. Haslam, and C. Römer. Thanks!

[[1]] Riaño Rufilanchas, Daniel. 1998. Análisis y etiquetado
sintáctico del corpus de los textos clásicos: modelos y perspectivas.
Studia Iranica, Mesopotamica & Anatolica 3:107-129.

>I wrote:
>Dear List,
>
>I am collecting information about the existing corpora of
>grammatically annotated texts in Greek and Latin, the tools used for
>the annotation or the edition of the texts to be analysed, and the
>schemes of grammatical annotation. Any information, in or off list,
>would be greatly appreciated.
>
>I am aware of the main projects for the grammatical annotation of
>several Biblical texts and mainly the Greek New Testament
>(Accordance, BibleWorks) and of course the Perseus Project, but
>would also appreciate any bibliographical aid or useful link to the
>technical description of the grammatical background and the
>annotation schemes of such projects. I'd also appreciate any
>information about present and future projects on the same field
>and/or in other ancient languages, too. Any province of the
>grammatical oecumene is relevant. If there is enough off list input
>about the matter, I will summarise to the list(s). Many thanks in
>advance,
>
>Daniel

--
~~~~~~~~~~~~~~~~~~~
Daniel Riaño Rufilanchas
Madrid, España