[Corpora-List] Unsupervised pos tagger

Osman Baskaya osbaskaya at gmail.com
Fri Feb 14 21:51:09 UTC 2014


Hi Matias,

Yatbaz and his colleagues are working on unsupervised POS induction and
achieve state-of-the-art scores on ~16 languages. Their paper has been
submitted to ACL '14. I am not sure how big your corpora will be, but you
may want to try it.

https://github.com/ai-ku/upos_2014
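
For a rough sense of what "words that appear to belong to the same class"
means in practice, here is a minimal sketch. It is not the method used in the
repository above, just a toy distributional grouping: words whose left and
right neighbours overlap end up in the same class. The function names, the
0.5 similarity threshold, and the toy corpus are all assumptions made for
illustration.

```python
from collections import Counter, defaultdict
from math import sqrt


def context_vectors(sentences):
    """Count the immediate left and right neighbours of every word type."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        padded = ["<s>"] + sentence + ["</s>"]
        for i in range(1, len(padded) - 1):
            vectors[padded[i]][("L", padded[i - 1])] += 1
            vectors[padded[i]][("R", padded[i + 1])] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[feature] for feature, count in a.items())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def induce_classes(sentences, threshold=0.5):
    """Greedy one-pass grouping: a word joins the first class whose
    seed word (the first member) has a similar enough context vector."""
    vectors = context_vectors(sentences)
    classes = []
    for word in sorted(vectors):
        for members in classes:
            if cosine(vectors[word], vectors[members[0]]) >= threshold:
                members.append(word)
                break
        else:
            classes.append([word])
    return classes


# Hypothetical toy corpus; real input would be tokenised corpus sentences.
toy = ["the dog runs", "the cat runs", "a dog sleeps", "a cat sleeps"]
for members in induce_classes([s.split() for s in toy]):
    print(members)
```

The classes come out unlabeled, which matches the use case of grouping words
without naming the groups. On real data a proper induction system will do far
better than this greedy pass, which is sensitive to word order and threshold.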


On Fri, Feb 14, 2014 at 10:58 AM, Matías Guzmán Naranjo <
mortem.dei at gmail.com> wrote:

> Thanks Grzegorz, I'll take a look. I don't really need the tagger to tell
> me whether a particular word is a verb or a noun, but to tell me which
> words appear to belong to the same grammatical class, whatever that class
> might be. I need to analyze examples taken from corpora of languages I
> don't know, so a bit of initial help would make things easier.
>
>
> 2014-02-14 9:52 GMT+01:00 Grzegorz Chrupała <G.A.Chrupala at uvt.nl>:
>
>> Hi Matías,
>>
>> I think fully unsupervised POS tagging isn't yet quite good enough to
>> be useful for end users. But it depends on what exactly you need.
>>
>> Have a look at the papers from the shared task at the 2012 Workshop on
>> Inducing Linguistic Structure:
>> http://wiki.cs.ox.ac.uk/InducingLinguisticStructure/SharedTask
>> --
>> Grzegorz
>>
>> On Fri, Feb 14, 2014 at 12:16 AM, Matías Guzmán Naranjo
>> <mortem.dei at gmail.com> wrote:
>> > Dear all,
>> >
>> > I would like to hear your opinions on which is/are the best unsupervised
>> > pos tagger/s (preferably foss). I'm working with corpora for many different
>> > languages for which there are no specific taggers developed and need to get
>> > a basic idea about parts of speech.
>> >
>> > Thanks!
>> >
>> > Matías
>> >
>> > _______________________________________________
>> > UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
>> > Corpora mailing list
>> > Corpora at uib.no
>> > http://mailman.uib.no/listinfo/corpora
>> >
>>
>