[Corpora-List] Summary of responses: German lemma list

Niels Ott niels at drni.de
Sat Mar 10 16:57:10 UTC 2007

Dear all,

Over a week ago I asked for a German lemma list and received a number of
replies. Of all the suggestions made, extracting a lemma list from the
ispell word list won the race, because this was the easiest thing to do
in the limited time we had.
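
For readers who want to try the same route, here is a minimal Python
sketch of the idea. It is not the script we used. It assumes a
dictionary file with one entry per line, where affix flags follow a
slash (as in ispell/hunspell style word lists) and where a line
consisting only of digits is an entry count. The function name and the
sample flags are made up for illustration. Note that the base forms of
a spell checker are only an approximation of a lemma list.

```python
def extract_base_forms(lines):
    """Collect base forms from spell-checker dictionary lines.

    Assumed format (not taken from any particular dictionary):
    one entry per line, optional affix flags after a slash.
    """
    forms = set()
    for raw in lines:
        entry = raw.strip()
        # Skip blank lines and a digits-only line (assumed entry count).
        if not entry or entry.isdigit():
            continue
        # Keep the first whitespace-separated token, then drop the
        # affix flags after the first slash: "gehen/IX" -> "gehen".
        word = entry.split()[0].split("/", 1)[0]
        if word:
            forms.add(word)
    return sorted(forms)


if __name__ == "__main__":
    # Hypothetical sample entries; the flags are invented.
    sample = ["4", "Haus/EPS", "Katze/N", "gehen/IX", "sitzen"]
    for form in extract_base_forms(sample):
        print(form)
```

To run it on a real file, pass an open file object instead of the
sample list, e.g. extract_base_forms(open(path, encoding="utf-8")),
adjusting the encoding to whatever the dictionary actually uses.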

Let me briefly summarize the suggestions I received both on the list and
in private (in no particular order):

Annette Klosa offered a contract covering academic use of the word list
from the elexiko project, which is based on frequency data from the
German IDS corpora. http://www.elexiko.de/

Lars Aronson suggested using German spell checker dictionaries, namely
those of ispell/aspell/myspell/hunspell.*

René Witte suggested having a look at the Durm Lemmatizer, which
apparently comes with a lexicon.*

Yannick Versley suggested using the lexicon of the CDG parser.*

Peter Adolphs suggested having a look at Morphy by Wolfgang Lezius,
which can export the lexical data it uses. http://www.wolfganglezius.de/

[*]: Those are (part of) open source projects.

Thank you very much for your assistance!


   Niels Ott

Niels Ott wrote:
> Dear all,
> about a month ago there was a little discussion going on here about
> English lemma lists.
> We need a lemma list for German. There is no special requirement
> other than that it contains lemmata, e.g.
> Haus
> Katze
> gehen
> sitzen
> Furthermore, it would be nice if the list came with POS tags. But
> that's not a strict requirement.
> Ideally, the list would be free in the sense of free speech/open
> source, or its use restricted only to non-commercial applications.
> (This is for a student project at university.)
> Thank you very much in advance for your assistance.
> Regards,
>    Niels Ott

--
Niels Ott - Computational Linguist (B.A.) - http://www.drni.de/niels/
Tangente: Veralgter Wasservogel