[Corpora-List] Phrase extraction

Marco Baroni baroni at sslmit.unibo.it
Mon Oct 24 19:22:52 UTC 2005


Hi there.

Regarding the first option (creating a tagger for Norwegian):

Perhaps this is obvious, but if you are willing to assign tags to a
certain number of documents (say, about 15000 words) by hand, then you can
"train" a part-of-speech tagger, e.g., one or more of the acopost taggers
(http://sourceforge.net/projects/acopost/). Or, you could try to contact
somebody who already did that (just look for information on annotated
Norwegian corpora on the Web), and see if they can let you use their
tagger, or at least let you train a tagger on their annotated data...
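
To make the idea of "training" concrete, here is a minimal sketch in Python
of the simplest possible trained tagger: it records the most frequent tag of
each word in the hand-tagged data and falls back to the overall most frequent
tag for unknown words. This is not how the acopost taggers work internally;
it is only a baseline illustration. The function names, the toy sentences and
the tag labels (PRON, VERB, ...) are all assumptions made up for the example.

```python
from collections import Counter, defaultdict


def train_unigram_tagger(tagged_sentences):
    """Learn a word -> most-frequent-tag lexicon from hand-tagged sentences."""
    counts = defaultdict(Counter)   # word form -> Counter of tags seen with it
    tag_totals = Counter()          # overall tag frequencies
    for sentence in tagged_sentences:
        for word, tag in sentence:
            counts[word.lower()][tag] += 1
            tag_totals[tag] += 1
    # Keep only the most frequent tag for each word form.
    lexicon = {w: c.most_common(1)[0][0] for w, c in counts.items()}
    # Unknown words get the most frequent tag in the training data.
    default_tag = tag_totals.most_common(1)[0][0]
    return lexicon, default_tag


def tag(words, lexicon, default_tag):
    """Tag a list of word forms using the learned lexicon."""
    return [(w, lexicon.get(w.lower(), default_tag)) for w in words]


# Toy hand-tagged sample; a real training set would be far larger
# (on the order of the 15000 words mentioned above).
train = [
    [("jeg", "PRON"), ("leser", "VERB"), ("en", "DET"), ("bok", "NOUN")],
    [("hun", "PRON"), ("leser", "VERB"), ("en", "DET"), ("avis", "NOUN")],
    [("hun", "PRON"), ("kjøper", "VERB"), ("brød", "NOUN"),
     ("og", "CONJ"), ("melk", "NOUN")],
]

lexicon, default_tag = train_unigram_tagger(train)
result = tag(["hun", "leser", "en", "bok"], lexicon, default_tag)
unknown = tag(["tidsskrift"], lexicon, default_tag)
print(result)
print(unknown)
```

A real tagger also uses the context (the neighbouring tags) and word endings
to guess the tags of unknown words, which is why the hand-tagged sample has
to be reasonably large.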

Regards,

Marco


