[Corpora-List] GATE Processing Resources for Dutch
Martin Reynaert
reynaert at uvt.nl
Wed Oct 24 15:55:47 UTC 2012
Hi Ivelina,
At Tilburg University we have Frog (http://ilk.uvt.nl/frog/). It is the
de facto state-of-the-art tool for Dutch, for all the linguistic
analyses and annotations you require.
The new reference corpus of written Dutch, SoNaR (over 500 million words),
which will be free for research, has been linguistically enriched by
Frog. In fact, Frog now also does Named Entity Labeling.
It is written in C++ rather than Java, which enables far better performance.
These days, Frog comes packaged with Debian and Ubuntu.
Regards,
Martin
On 10/24/2012 05:18 PM, Diana Maynard wrote:
> Hi Ivelina
> In principle, you can incorporate pretty much any of these kinds of
> resources into GATE (if you have the source for them) via a wrapper,
> though it's much easier if they're already Java-based. There's not
> much specifically for Dutch that comes already with GATE, but there
> are possibilities for adaptation of some of the existing GATE
> resources (and we've dabbled ourselves in a few bits and bobs). You'll
> most likely get better answers to the adaptation issue on the
> gate-users mailing list, which I see you've already contacted anyway.
> Regards
> Diana
>
>
> On 24/10/2012 14:25, Ivelina Nikolova wrote:
>> Dear Corpora Members,
>>
>> I'm looking for tools for processing Dutch that are suited to GATE.
>> Could you please suggest a sentence splitter, POS tagger, chunker,
>> dependency parser, or any other tools that could be useful for
>> text analysis?
>>
>> Thanks in advance!
>>
>> Best,
>> Ivelina
> _______________________________________________
> UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora