[Corpora-List] Phrase extraction

Helge Thomas Karset Hellerud helgetho at stud.ntnu.no
Mon Oct 24 18:47:43 UTC 2005


Hello,

PoS (Part of Speech) tagging is often used to extract phrases from text
(like Noun Phrases). But that approach assumes you have a PoS tagger
available. My document collection is in Norwegian, but I don't have a
Norwegian tagger.

1) Is there a way to create a simple PoS tagger to recognize verbs,
nouns and adjectives (in Norwegian)?

2) If not, does anyone have other approaches to extracting phrases (such
as a statistical approach)?

Thanks in advance.

Helge


