[Corpora-List] Penn Treebank annotated with chunks

Thomas Proisl tsproisl at linguistik.uni-erlangen.de
Mon Aug 27 08:43:22 UTC 2012


Hi Aleksandar,

There is a Perl script by Sabine Buchholz (chunklink) that converts
parsed Penn Treebank sentences into chunks. It was used to generate the
data for the CoNLL-2000 Shared Task on chunking.

http://ilk.uvt.nl/team/sabine/chunklink/README.html
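
In case it helps with consuming the result: the CoNLL-2000 data is plain
text with one token per line (word, POS tag, chunk tag in IOB notation)
and a blank line between sentences. Below is a minimal Python sketch for
reading that format and grouping the tags into chunks. It is not part of
chunklink; the function names and the sample sentence are made up for
illustration, and it assumes exactly three whitespace-separated columns.

```python
from typing import Iterable, Iterator, List, Optional, Tuple

Token = Tuple[str, str, str]  # (word, POS tag, chunk tag)

# Illustrative sample in the three-column format (hypothetical data).
SAMPLE = """\
He PRP B-NP
reckons VBZ B-VP
the DT B-NP
current JJ I-NP
account NN I-NP
deficit NN I-NP
. . O
"""


def parse_conll_chunks(lines: Iterable[str]) -> Iterator[List[Token]]:
    """Yield one list of (word, pos, chunk_tag) tuples per sentence."""
    sentence: List[Token] = []
    for line in lines:
        line = line.strip()
        if not line:
            # A blank line ends the current sentence.
            if sentence:
                yield sentence
                sentence = []
            continue
        word, pos, chunk = line.split()  # assumes exactly three columns
        sentence.append((word, pos, chunk))
    if sentence:
        yield sentence


def iob_to_spans(sentence: List[Token]) -> List[Tuple[str, List[str]]]:
    """Group B-X / I-X tags into (chunk_type, words) spans; O tokens are skipped."""
    spans: List[Tuple[str, List[str]]] = []
    current_type: Optional[str] = None
    current_words: List[str] = []
    for word, _pos, tag in sentence:
        prefix, _, chunk_type = tag.partition("-")
        inside = prefix == "I" and chunk_type == current_type
        if inside:
            current_words.append(word)
            continue
        # Anything else closes the open chunk, if there is one.
        if current_words:
            spans.append((current_type, current_words))
        if prefix == "O":
            current_type, current_words = None, []
        else:
            # B-X, or a stray I-X with no matching open chunk, starts a new chunk.
            current_type, current_words = chunk_type, [word]
    if current_words:
        spans.append((current_type, current_words))
    return spans


if __name__ == "__main__":
    for sent in parse_conll_chunks(SAMPLE.splitlines()):
        for chunk_type, words in iob_to_spans(sent):
            print(chunk_type, " ".join(words))
```

To use it on a real file, pass an open file handle to
parse_conll_chunks() instead of SAMPLE.splitlines().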

Best regards,
Thomas


On Mon, 13 Aug 2012 13:52:08 +0100,
Aleksandar Savkov <cytehuop at gmail.com> wrote:

> Hello everybody,
> 
> I'm looking for a chunk-annotated version of the Penn Treebank. It
> seems to be the most popular resource for training and testing
> chunking software, but I haven't been able to find a chunked version
> or an algorithm for extracting chunks in a deterministic way. Is
> there a standard resource that everybody uses or does everybody just
> extract the chunks from the parsed data themselves?
> 
> Best,
> Aleksandar Savkov


-- 
Department Germanistik und Komparatistik
Professur für Computerlinguistik
Bismarckstr. 6, 91054 Erlangen

Institut für Anglistik und Amerikanistik
Lehrstuhl für Anglistik, insbesondere Linguistik
Bismarckstr. 1, 91054 Erlangen

Phone: +49 9131 85-25908; Fax: +49 9131 85-29251
http://www.linguistik.uni-erlangen.de/~tsproisl/