[Corpora-List] Penn Treebank annotated with chunks
Thomas Proisl
tsproisl at linguistik.uni-erlangen.de
Mon Aug 27 08:43:22 UTC 2012
Hi Aleksandar,
there is a Perl script by Sabine Buchholz (chunklink) that converts
Penn Treebank parse trees into chunks. It was used to generate the
data for the CoNLL-2000 shared task on chunking.
http://ilk.uvt.nl/team/sabine/chunklink/README.html
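
In case it helps with reading the result: the CoNLL-2000 data has one token per line as "word POS chunk-tag", with B-/I-/O chunk tags and a blank line between sentences. Below is a minimal Python sketch of grouping such lines back into chunks. It is not part of chunklink; the function name iob_to_chunks and the sample lines are only illustrative assumptions, and tokens tagged O are simply dropped.

```python
def iob_to_chunks(lines):
    """Group 'word POS tag' lines into (chunk_type, [words]) pairs.

    Hypothetical helper for illustration; assumes CoNLL-2000 style
    columns separated by whitespace and B-/I-/O chunk tags.
    """
    chunks = []
    current = None  # the chunk currently being built, or None
    for line in lines:
        fields = line.split()
        if len(fields) != 3:
            # Blank line (sentence boundary) or malformed line:
            # close any open chunk and move on.
            if current:
                chunks.append(current)
            current = None
            continue
        word, _pos, tag = fields
        prefix, _, chunk_type = tag.partition("-")
        if prefix == "I" and current and current[0] == chunk_type:
            # Continuation of the open chunk.
            current[1].append(word)
            continue
        # Any other tag ends the open chunk.
        if current:
            chunks.append(current)
        # B- starts a new chunk; a stray I- is treated as a start too.
        current = (chunk_type, [word]) if prefix in ("B", "I") else None
    if current:
        chunks.append(current)
    return chunks


# Illustrative input in the three-column format described above.
SAMPLE = """He PRP B-NP
reckons VBZ B-VP
the DT B-NP
current JJ I-NP
account NN I-NP
deficit NN I-NP
will MD B-VP
narrow VB I-VP
. . O
""".splitlines()

if __name__ == "__main__":
    for chunk_type, words in iob_to_chunks(SAMPLE):
        print(chunk_type, " ".join(words))
```

Treating a stray I- tag as the start of a new chunk is a design choice of this sketch, made so that slightly inconsistent input still yields usable spans.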
Best regards,
Thomas
On Mon, 13 Aug 2012 13:52:08 +0100,
Aleksandar Savkov <cytehuop at gmail.com> wrote:
> Hello everybody,
>
> I'm looking for a chunk-annotated version of the Penn Treebank. It
> seems to be the most popular resource for training and testing
> chunking software, but I haven't been able to find a chunked version
> or an algorithm for extracting chunks in a deterministic way. Is
> there a standard resource that everybody uses or does everybody just
> extract the chunks from the parsed data themselves?
>
> Best,
> Aleksandar Savkov
--
Department Germanistik und Komparatistik
Professur für Computerlinguistik
Bismarckstr. 6, 91054 Erlangen
Institut für Anglistik und Amerikanistik
Lehrstuhl für Anglistik, insbesondere Linguistik
Bismarckstr. 1, 91054 Erlangen
Fon: +49 9131 85-25908; Fax: +49 9131 85-29251
http://www.linguistik.uni-erlangen.de/~tsproisl/