[Corpora-List] Irish language corpora

Ciarán Ó Duibhín ciaran at oduibhin.freeserve.co.uk
Sat Nov 25 23:08:42 UTC 2006


Ronan Fitzgerald wrote:

> I am looking for a corpus of Irish language for some research, but all I
> seem to be able to find are corpora based on literary texts, predominantly
> dated from before the 20th Century.  For my research purposes, I need a
> corpus that contains terminology that is as contemporary as possible.

Interesting, Ronan.  I wonder if your research will bear out my experience
that contemporary Irish (say, Irish from the last quarter of the 20th
century), being written overwhelmingly by people belonging to a tradition
whose first language is English, is heavily based on English semantics, and
is substantially different (lexicographically, in particular) from what may
be called "continuity Irish", as transmitted - in three main dialects - by
native speakers of Irish.

I have about 3M words of literary continuity Irish from the first half of
the 20th century (see http://www.smo.uhi.ac.uk/~oduibhin/tobar/index.htm for
some information) and as you say there are some other corpora of this sort.
But the differences between this and non-continuity Irish may well not be
something with which your research is concerned.

On computing terminology in Irish, I offer some thoughts in the light of the
continuity/non-continuity divide in
http://www.smo.uhi.ac.uk/~oduibhin/tearmai/index.htm.

I hope this is some help, and in any case, good luck with your work.

Ciarán Ó Duibhín.


