[Corpora-List] compound analyzers

Peter Adolphs peter.adolphs at student.hu-berlin.de
Mon Sep 25 09:19:22 UTC 2006


Hi!

Lars Nygaard wrote:
> I'm looking for an open source compound analyzer for germanic-style
> compounds. I should be able to produce all possible solutions of words,
> even unlikely ones (thus ruling out, for example, Hunspell).

I don't really understand your question. Several engines are
available, for instance Ocamorph or SFST. However, the really
interesting part is the lexical data.

For German, there is for example the open-source version of SMOR, a
finite-state morphology implemented in SFST. Its mechanism for
compounding is the same as for derivation: essentially concatenation of
morphemes, with feature checking to enforce the morphotactics. However,
it relies heavily on a lexicon, and only a very small demo lexicon is
available.
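
To make "concatenation of morphemes with feature checking" a bit more
concrete, here is a minimal Python sketch. It is not SMOR or SFST code
and does not use a finite-state transducer. The four-entry toy lexicon,
the role labels "modifier" and "head", and the function name analyze are
all made up for illustration. It simply enumerates every segmentation
whose non-final parts may act as modifiers and whose final part may act
as a head, including unlikely ones.

```python
from typing import Dict, List, Set

# Hypothetical toy lexicon (an assumption, not SMOR data):
# surface form -> positions in which the form is allowed.
LEXICON: Dict[str, Set[str]] = {
    "stau": {"modifier", "head"},
    "staub": {"modifier", "head"},
    "becken": {"modifier", "head"},
    "ecken": {"head"},  # plural form, only allowed word-finally here
}


def analyze(word: str, lexicon: Dict[str, Set[str]] = LEXICON) -> List[List[str]]:
    """Return all segmentations of `word` licensed by the lexicon."""
    word = word.lower()
    results: List[List[str]] = []

    def extend(start: int, parts: List[str]) -> None:
        # Try every substring beginning at `start` as the next morpheme.
        for end in range(start + 1, len(word) + 1):
            piece = word[start:end]
            roles = lexicon.get(piece)
            if roles is None:
                continue  # not in the lexicon
            if end == len(word):
                # Feature check: the last part must be usable as a head.
                if "head" in roles:
                    results.append(parts + [piece])
            elif "modifier" in roles:
                # Feature check: non-final parts must be usable as modifiers.
                extend(end, parts + [piece])

    extend(0, [])
    return results


if __name__ == "__main__":
    for segmentation in analyze("Staubecken"):
        print("+".join(segmentation))
```

With this toy lexicon, "Staubecken" gets both the Stau+Becken and the
Staub+Ecken reading, while "Eckenstau" is rejected because "ecken" is
not licensed as a modifier. As said above, the quality of the output
depends entirely on the lexicon.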

Best regards!

-- 
Peter Adolphs    peter.adolphs at student.hu-berlin.de    gpg/pgp welcome!


