[Corpora-List] tagger for Ukrainian

Francis Tyers ftyers at prompsit.com
Mon Feb 7 17:25:48 UTC 2011


On Mon, 07 Feb 2011 at 17:52 +0100, Natalia Kotsyba wrote:
> Hi, Tony, all,
> well I am one of the UGTag people and probably the one to explain the
> situation. The project is suspended now due to the lack of funding. We
> have a large dictionary with info about wordforms, lemmas and
> morphosyntactic tags and a program which assigns lemmas and tags to
> words in texts. I say "a program" because half of the wordforms have
> several interpretations and disambiguation is not done, so it is not a
> tagger in the most general sense.
> At the moment we are working on both rules and manual disambiguation
> to train the tagger, but I'd rather not make any promises about when it
> will be ready.

"Release early, release often" :)

> By the way, if there are any volunteers on the list who
> would be willing to join the disambiguation part of the project, they
> would be most welcome.

Is it intended to release the result under an open-source/free licence?
If so I know several people who may be interested and will pass the
details along to them. If you are interested in arguments for why this
would be a good idea, check out Ted Pedersen's paper here[1].

What disambiguation framework are you using for the rules? Something
like Constraint Grammar?

Regards,

Fran

1. Pedersen, T. (2008) "Empiricism is not a matter of faith".
Computational Linguistics.
http://www.aclweb.org/anthology/J/J08/J08-3010.pdf

> A part of the dictionary, about 15 thousand lemmas and 205 thousand
> wordforms with tags, is publicly available in the Ukrainian package of
> MULTEXT-East, version 4 at http://nl.ijs.si/ME/V4/.
> 
> Tony,
> I don't think you ever contacted me or my colleagues about the
> program. Which address did you use?
> 
> Best regards,
> Natalia Kotsyba.
> 
> On 7 February 2011 15:48, Mcenery, Tony <eiaamme at exchange.lancs.ac.uk> wrote:
> > Yes, I too have been looking for a Ukrainian tagger, but to no avail. I made a real effort to get a response from the UGTag people but failed - I fear it is vapourware. I think Serge Sharoff at Leeds may be planning to produce a Ukrainian tagger, but it is some way off. Best,
> >
> > Tony
> >
> > -----Original Message-----
> > From: corpora-bounces at uib.no [mailto:corpora-bounces at uib.no] On Behalf Of Noah Bubenhofer
> > Sent: 07 February 2011 14:25
> > To: corpora at uib.no
> > Subject: Re: [Corpora-List] tagger for Ukrainian
> >
> > Hi,
> >
> > this is indeed difficult: I discovered UGTag:
> > http://www.domeczek.pl/~polukr/parcor/
> >
> > But development seems to be suspended, and I didn't succeed in getting a
> > copy of the tagger.
> >
> > ABBYY has a commercial solution for at least the morphological analysis
> > of Ukrainian. Its name is ABBYY Morphology Engine. Perhaps you can get
> > an evaluation copy of the tool.
> >
> > I'd also be interested in a Ukrainian tagger.
> >
> > Noah
> >
> > On 07.02.2011 12:19, Montserrat Civit wrote:
> >> Hi,
> >>
> >> Does anyone know about a tagger for Ukrainian?
> >> I would greatly appreciate any ideas and suggestions. Thanks so much in advance!
> >>
> >>
> >> --
> >> Montserrat Civit Torruella
> >>
> >>
> >>
> >> _______________________________________________
> >> Corpora mailing list
> >> Corpora at uib.no
> >> http://mailman.uib.no/listinfo/corpora
> >
> > --
> > Dr. Noah Bubenhofer
> > Institut für Deutsche Sprache, R5 6-13, D-68161 Mannheim
> > Postadresse: Postfach 10 16 21, D-68016 Mannheim
> > Tel: +49(621) 1581-217
> > Fax: +49(621) 1581-200
> > E-Mail: bubenhofer at ids-mannheim.de
> >


