[Corpora-List] GATE Processing Resources for Dutch

G.J.M.Noord, van g.j.m.van.noord at rug.nl
Wed Oct 24 16:31:28 UTC 2012


Ivelina,


Never trust a tool that does "all .. that you require".


If you need a parser for Dutch, use Alpino. Why bother with
a POS-tagger or a chunker if you can get a full syntactic analysis
of all the words and phrases of your input?


In fact, the dependency analysis of Frog is trained on the output
of Alpino. The accuracy of Frog for labeled dependencies is reported
to be 76%.  The accuracy of Alpino for labeled dependencies is
over 90%.


So much for "de facto state-of-the-art tool for Dutch".


http://www.let.rug.nl/vannoord/alp/Alpino/




And, no, neither Frog nor Alpino is ready for GATE, afaik...


GJ


On 24-10-12, Martin Reynaert wrote:
> Hi Ivelina,
> 
> At Tilburg University we have Frog (http://ilk.uvt.nl/frog/). It is the de facto state-of-the-art tool for Dutch, for all the linguistic analyses and annotations you require.
> 
> The new reference corpus (over 500 million words) of written Dutch - SoNaR - (which will be free for research) has been linguistically enriched by Frog.  Frog now also does Named Entity Labeling, in fact.
> 
> It is not in Java; it is in C++, which enables far better performance.
> 
> These days, Frog comes packaged with Debian and Ubuntu.
> 
> Regards,
> 
> Martin
> 
> On 10/24/2012 05:18 PM, Diana Maynard wrote:
> >Hi Ivelina
> >In principle, you can incorporate pretty much any of these kinds of resources into GATE (if you have the source for them) via a wrapper, though it's much easier if they're already Java-based. There's not much specifically for Dutch that comes already with GATE, but there are possibilities for adaptation of some of the existing GATE resources (and we've dabbled ourselves in a few bits and bobs). You'll most likely get better answers to the adaptation issue on the gate-users mailing list, which I see you've already contacted anyway.
> >Regards
> >Diana
> >
> >
> >On 24/10/2012 14:25, Ivelina Nikolova wrote:
> >>Dear Corpora Members,
> >>
> >>I'm looking for tools for processing Dutch which are suited for GATE.
> >>Could you please help me with some suggestions for Sentence splitter,
> >>POS, Chunker, Dependency Parser or any others which could be useful for
> >>text analysis.
> >>
> >>Thanks in advance!
> >>
> >>Best,
> >>Ivelina
> >>
> >>
> >>
> >
> >
> >_______________________________________________
> >UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
> >Corpora mailing list
> >Corpora at uib.no
> >http://mailman.uib.no/listinfo/corpora
> 
> 


