[Corpora-List] Re: Minor(ity) Language (was: 'Standard European English' )

Chantal ENGUEHARD Chantal.Enguehard at univ-nantes.fr
Wed Mar 8 16:23:25 UTC 2006


I use the term "under-resourced language" for languages that have few
linguistic resources (dictionaries, grammars) and also for languages that are
poorly supported by computers.
The number of speakers is not taken into account at all.

I get the impression that many different terms are appearing that do not
designate exactly the same concepts. They are often confused because some
languages belong to several of these categories at the same time.
For instance, an "endangered language" can also be a "minor language" and an
"under-resourced language".

Chantal Enguehard (please excuse my poor English)

Note: [In 2004, Vincent Berment defined in his thesis* an evaluation grid to
measure precisely the degree of computerization of any language. This
grid yields a score on a 20-point scale.
If the score is less than 10 points, the language is said to be a
pi-language (pi being the Greek letter p).
If the score is more than 14 points, the language is said to be a
tau-language (tau being the Greek letter t).
Otherwise the language is said to be a mu-language (mu being the Greek letter
m).]

* Vincent Berment, "Méthodes pour informatiser des langues et des groupes de
langues “peu dotées”", thèse de doctorat, GETA, Laboratoire CLIPS, IMAG,
Université Joseph Fourier, 18 mai 2004.
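
The thresholds in the note above can be sketched in a few lines of Python.
This is only an illustration of the three categories: the function name is
hypothetical, and how the grid itself computes the score is defined in the
thesis and is not reproduced here.

```python
def classify_computerization(score: float) -> str:
    """Map a score on the 20-point grid to a category name.

    Hypothetical helper: only the thresholds come from the note above.
    """
    if not 0 <= score <= 20:
        raise ValueError("score must lie on the 20-point scale")
    if score < 10:
        return "pi"   # less than 10 points
    if score > 14:
        return "tau"  # more than 14 points
    return "mu"       # everything from 10 to 14 inclusive


print(classify_computerization(8.5))
print(classify_computerization(12))
print(classify_computerization(16))
```

Note that scores of exactly 10 and exactly 14 fall in the mu category, since
the note says "less than 10" and "more than 14".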


Chantal ENGUEHARD
LINA
2, rue de la Houssinière
BP 92208
44322 Nantes Cedex 03
France


