[Corpora-List] Summary of responses: German lemma list

Niels Ott niels at drni.de
Sat Mar 10 16:57:10 UTC 2007


Dear all,

Over a week ago I asked for a German lemma list and received a number of
replies. Of all the suggestions made, extracting a lemma list from the
ispell word list won the race, because it was the easiest thing to do in
the limited time we had.
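
For anyone who wants to try the same route, here is a minimal sketch of
the idea, not the script we actually used. It assumes the dictionary has
one entry per line in the form "word/FLAGS", as spell-checker word lists
commonly do, possibly preceded by an entry-count line. The function name
and the sample flags are made up for illustration. Note that base forms
in a spell-checker dictionary are only lemma candidates, not guaranteed
linguistic lemmata.

```python
def extract_lemma_candidates(lines):
    """Return sorted, unique base forms from 'word/FLAGS' dictionary lines."""
    candidates = set()
    for i, raw in enumerate(lines):
        line = raw.strip()
        # Skip blank lines and (assumed) '#' comment lines.
        if not line or line.startswith("#"):
            continue
        # Some dictionary files start with an entry count; skip it.
        if i == 0 and line.isdigit():
            continue
        # Everything after the first '/' is affix flags, not part of the word.
        word = line.split("/", 1)[0].strip()
        if word:
            candidates.add(word)
    return sorted(candidates)


if __name__ == "__main__":
    # Hypothetical sample entries; the flags are invented for this example.
    sample = ["5", "Haus/EPT", "Katze/N", "gehen/DIX", "sitzen", "Haus/S"]
    print(extract_lemma_candidates(sample))
```

With the sample above this prints ['Haus', 'Katze', 'gehen', 'sitzen'].
Real word lists may use other encodings or extra fields per line, so
the parsing would need to be adapted to the file at hand.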

Let me briefly summarize the suggestions I received both on the list and
in private (in no particular order):

Annette Klosa offered a contract for academic use of the word list from
the elexiko project, which is based on frequency data from the German IDS
corpora. http://www.elexiko.de/

Lars Aronson was the one who suggested using German spell-checker
dictionaries, namely those of ispell/aspell/myspell/hunspell.*

René Witte suggested having a look at the Durm Lemmatizer, which
apparently comes with a lexicon.*
http://www.ipd.uni-karlsruhe.de/~durm/tm/lemma/

Yannick Versley suggested using the lexicon of the CDG parser.*
http://nats-www.informatik.uni-hamburg.de/view/CDG/DownloadPage

Peter Adolphs suggested having a look at Morphy by Wolfgang Lezius,
which can export the lexical data it uses. http://www.wolfganglezius.de/

[*]: These are (part of) open source projects.

Thank you very much for your assistance!

Regards,

   Niels Ott


Niels Ott wrote:
> Dear all,
> 
> About a month ago there was a little discussion going on here about
> English lemma lists.
> 
> We would like to have a lemma list for German. The only requirement is
> that it contain lemmata, e.g.
> 
> Haus
> Katze
> gehen
> sitzen
> 
> Furthermore, it would be nice if the list included POS tags, but
> that's not a strict requirement.
> 
> Ideally, the list would be free in the sense of free speech/open
> source, or at least usable for non-commercial applications. (This is
> for a student project at a university.)
> 
> Thank you very much in advance for your assistance.
> 
> Regards,
> 
>    Niels Ott
> 
> 

- --
Niels Ott - Computational Linguist (B.A.) - http://www.drni.de/niels/
Tangente: Veralgter Wasservogel


