[Corpora-List] Searching for Morphological Analyzers
Mike Maxwell
maxwell at ldc.upenn.edu
Thu Jun 24 14:52:06 UTC 2004
(I replied to this earlier, but accidentally only sent it to the original
poster. Since unlike Linguist List, this list doesn't have a "reply to
sender and let them summarize" policy--at least I don't think it
does--I'm sending this to the list, in case there are others interested.
Apologies if that's not kosher.)
Navot Akiva wrote:
> I'm a PhD student, and I'm searching for good (hopefully free)
> morphological analyzers for English, Turkish, and Finnish and Hebrew .
--------------------
English:
SIL provides the free 'Englex' grammar, which runs under their free
PC-KIMMO parser: http://www.sil.org/pckimmo/v2/englex.html and
http://www.sil.org/pckimmo/about_pc-kimmo.html
--------------------
Finnish: Commercial programs with demos:
Fintwol, developed by Kimmo Koskenniemi and a company called Lingsoft. A
demo is available at http://www.lingsoft.fi/cgi-bin/fintwol
Xerox demo at
http://www.xrce.xerox.com/competencies/content-analysis/demos/finnish.
(Their web site frequently changes, so you may have to look around a bit
to find this.)
--------------------
Turkish:
A PC-KIMMO analyzer is available at
http://www-2.cs.cmu.edu/afs/cs/project/ai-repository/ai/areas/nlp/morph/pc_kimmo/turklex/turklex.tgz
Xerox and Bilkent University have done one more recently, but the links
were dead the last time I checked. Some years ago, Jorge Hankamer, of
UC Santa Cruz(?), wrote a Turkish morphological parser in C. I
don't know what the status of it is now.
--------------------
Hebrew:
The web page at
http://www.cs.technion.ac.il/~erelsgl/bxi/hmntx/teud_tokna.html#English
includes source files for three parsers (including lexicon files): a
general parser (all parses), a probabilistic parser, and a parser that
uses a syntactic parse (full? partial?) to disambiguate. Should be
buildable under Windows and Unix.
Mike Maxwell
Linguistic Data Consortium