The work that Anna Feldman, Jiri Hana and I did is designed to produce a reasonably good tagger with<div>very little effort. I would be really pleased if this approach turned out to be of value for Ukrainian. Our system's probability model has two components: the first (the transition component) says how one part of speech follows another; the second (the emission component) says how individual words are associated with parts of speech. Anna's dissertation includes follow-on experiments motivated by the hope that the emission component can be significantly improved using traditional linguistic notions such as cognatehood. Our system can definitely use some insights from traditional philology in this area, and Ukrainian might be just the language where success can be demonstrated, given its long history of contact with Russian.</div>
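<div><br></div><div>To make the two components concrete, here is a minimal sketch of a generic bigram HMM tagger with Viterbi decoding. It is not our actual system: the tag set, the English toy words, and every probability in START, TRANS and EMIT are invented for illustration, and the floor value for unseen events is likewise an assumption.</div>

```python
import math

# Toy model. All names and numbers below are illustrative assumptions.
TAGS = ["DET", "NOUN", "VERB"]
START = {"DET": 0.8, "NOUN": 0.1, "VERB": 0.1}
# Transition component: P(tag | previous tag)
TRANS = {
    "DET": {"DET": 0.05, "NOUN": 0.9, "VERB": 0.05},
    "NOUN": {"DET": 0.1, "NOUN": 0.2, "VERB": 0.7},
    "VERB": {"DET": 0.5, "NOUN": 0.3, "VERB": 0.2},
}
# Emission component: P(word | tag)
EMIT = {
    "DET": {"the": 0.9},
    "NOUN": {"dog": 0.5, "barks": 0.1},
    "VERB": {"dog": 0.05, "barks": 0.5},
}


def logp(p, floor):
    """Log probability, with a small floor so unseen events are not -inf."""
    return math.log(max(p, floor))


def viterbi(words, tags, start, trans, emit, floor=1e-6):
    """Return the most probable tag sequence for words under a bigram HMM."""
    if not words:
        return []
    # best[i][t]: best log score of any tagging of words[:i+1] ending in tag t
    best = [{t: logp(start.get(t, 0.0), floor)
                + logp(emit[t].get(words[0], 0.0), floor) for t in tags}]
    back = [{}]
    for w in words[1:]:
        scores, pointers = {}, {}
        for t in tags:
            # Pick the previous tag that maximizes score-so-far plus transition.
            prev = max(tags, key=lambda p: best[-1][p]
                       + logp(trans[p].get(t, 0.0), floor))
            scores[t] = (best[-1][prev]
                         + logp(trans[prev].get(t, 0.0), floor)
                         + logp(emit[t].get(w, 0.0), floor))
            pointers[t] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda t: best[-1][t])]
    for pointers in reversed(back[1:]):
        path.append(pointers[path[-1]])
    return path[::-1]


print(viterbi("the dog barks".split(), TAGS, START, TRANS, EMIT))
```

<div>In a sketch like this, improving the emission component amounts to putting better numbers in EMIT, which is where information such as cognates from a related language could in principle be used.</div>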

<div><br></div><div>Chris</div><div><br><div class="gmail_quote">2011/2/8 Natalia Kotsyba <span dir="ltr"><<a href="mailto:gnatko@gmail.com">gnatko@gmail.com</a>></span><br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">

Thanks to all for the comments and advice, it is really motivating.<br>

<div class="im"><br>

>> By the way, if there are any volunteers on the list who<br>

>> would be willing to join the disambiguation part of the project, they<br>

>> would be most welcome.<br>

><br>

> Is it intended to release the result under an open-source/free licence ?<br>

<br>

</div>Yes, the ultimate goal is a free web service with a dictionary that is somewhat abridged<br>

(for copyright reasons) but still adequate for practical work.<br>

Meanwhile, given that there is interest in the resource, we are<br>

preparing a command-line version to be placed on sourceforge, which I<br>

hope to announce on the list by the end of this week.<br>

<div class="im"><br>

> If so I know several people who may be interested and will pass the<br>

> details along to them. If you are interested in arguments for why this<br>

> would be a good idea, check out Ted Pedersen's paper here[1].<br>

><br>

> What disambiguation framework are you using for the rules ? Something<br>

> like Constraint Grammar ?<br>

<br>

</div>I am focusing on LanguageTool now, <a href="http://www.languagetool.org/" target="_blank">http://www.languagetool.org/</a>,<br>

hoping eventually to involve people with a traditional education in<br>

Ukrainian philology, for whom it should be friendly enough to work<br>

further on disambiguation rules and other available features. If you<br>

have other suggestions, I would be glad to hear them.<br>

<br>

Regards,<br>

<font color="#888888">Natalia.<br>

</font><div><div></div><div class="h5"><br>

_______________________________________________<br>

Corpora mailing list<br>

<a href="mailto:Corpora@uib.no">Corpora@uib.no</a><br>

<a href="http://mailman.uib.no/listinfo/corpora" target="_blank">http://mailman.uib.no/listinfo/corpora</a><br>

</div></div></blockquote></div><br><br clear="all"><br>-- <br>Chris Brew, Ohio State University<br>

</div>