[Corpora-List] GATE Processing Resources for Dutch

Adam Funk a.funk at dcs.shef.ac.uk
Thu Nov 15 13:26:00 UTC 2012


[24/10/12 17:56] Diana Maynard wrote:

> However, if you particularly want it to work in GATE, it's possible
> we'll be integrating a newer version of OpenNLP into GATE shortly, which
> has models for Dutch. So that would be the simple solution, though I
> have no idea how good the Dutch components are.


We've done this now.  Below is a copy of the announcement from the
gate-users mailing list.

If you download the model files for Dutch and put them in the
models/dutch subdirectory of GATE's OpenNLP plugin, the Dutch sample
application should work without further configuration.
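
Before loading the sample application, it can help to confirm that the
downloaded files ended up in the right place.  The following is only a
minimal sketch: the function name and the list of file names are
assumptions (the names follow the pattern used on the OpenNLP models
page and may not match exactly what the sample application expects),
and the plugin path in the usage note is hypothetical.

```python
from pathlib import Path

# Assumed file names, following the naming pattern on the OpenNLP
# models-1.5 page; check them against the files you actually downloaded.
ASSUMED_DUTCH_MODELS = ["nl-token.bin", "nl-sent.bin", "nl-pos-maxent.bin"]


def missing_models(plugin_dir, language="dutch", expected=ASSUMED_DUTCH_MODELS):
    """Return the expected model files not found in <plugin_dir>/models/<language>."""
    model_dir = Path(plugin_dir) / "models" / language
    return [name for name in expected if not (model_dir / name).is_file()]
```

For example, missing_models("/path/to/gate/plugins/OpenNLP") returns an
empty list when every file in the assumed list is present in
models/dutch under that directory.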

~~~~~

We've substantially updated GATE's OpenNLP plugin over the past few days
to use the latest version of the OpenNLP library and the current model
files.  This updated plugin is available from svn and in today's daily
snapshot, and will be included in the 7.1 release.


The plugin includes model files for English and sample applications
(gapp files) for English, Dutch, and German.  You need to download the
model files for all languages other than English, as documented in the
updated GATE user guide.

http://gate.ac.uk/sale/tao/splitch21.html#sec:misc-creole:opennlp

Models are available for Danish, German, English, Spanish, Dutch,
Portuguese, and Swedish, but not for all the tools in each language.
(The GATE PR supports the Maxent POS tagging models but not the
Perceptron ones.)

http://opennlp.sourceforge.net/models-1.5/
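
When choosing which files to download, the restriction above can be
applied mechanically.  This is a rough sketch, assuming the files are
named <language code>-<component>.bin with "perceptron" in the name of
the Perceptron POS models; both the naming convention and the function
name are assumptions, not something the plugin provides.

```python
def usable_models(filenames, lang_code):
    """Keep the model files for lang_code, dropping Perceptron POS models.

    Assumes names of the form '<lang_code>-<component>.bin'.
    """
    prefix = lang_code + "-"
    return [
        name
        for name in filenames
        if name.startswith(prefix) and "perceptron" not in name
    ]
```

Given a list of file names copied from the models page,
usable_models(names, "nl") keeps only the Dutch entries that are not
Perceptron models.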


If you have annotated corpora, you can train your own models using the
OpenNLP training API outside of GATE, as described in the OpenNLP manual.

https://opennlp.apache.org/documentation/1.5.2-incubating/manual/opennlp.html


Enjoy!


