[Corpora-List] Perl reader for Treebank parse trees?

Yannick Versley versley at sfs.uni-tuebingen.de
Tue Apr 18 07:45:42 UTC 2006


Dear Philip,

> Does anyone have a convenient perl subroutine or module that will
> convert Treebank parse trees into internal perl data structures?  I've
> done a bit of Web searching looking for combinations of things like
> "perl", "s-expression", "sexpr", etc. with no luck, but I'm thinking
> such a thing must be out there....
I know of (at least) two Perl programs/modules from which you could try to
extract the needed functionality.

The first is Sabine Buchholz's chunklink.pl program, available at
http://ilk.uvt.nl/~sabine/chunklink/chunklink_2-2-2000_for_conll.pl
I think that taking the start_read and read_sentence subroutines together
with the terminal, non_terminal and trace classes should suffice. (Don't let
the 'package' keyword fool you: this is Perl, and those packages are classes.)

The second is the penn2negra.pl script from Michael Daum's DepSy (Dependency
Synthesizer), which uses a Parse::RecDescent parser and is part of the
software available at
http://nats-www.informatik.uni-hamburg.de/view/Papa/PapaDownloads
I've attached it here for convenience; otherwise it's in the utils/ directory
of the tarballs.

For the record, I use Python for my own needs, though not NLTK; I work with
some modules I built from scratch.
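
To give an idea of how little is involved, here is a minimal sketch of a reader
for bracketed Treebank trees in Python. It is not taken from the modules
mentioned above or from either Perl script; the names read_trees and leaves and
the (label, children) tuple layout are assumptions made up for this
illustration. It also assumes that no token contains a literal parenthesis
(the Treebank writes those as -LRB- and -RRB-).

```python
import re

# A token is "(", ")", or any run of characters that is neither
# whitespace nor a parenthesis.
TOKEN_RE = re.compile(r"\(|\)|[^\s()]+")


def read_trees(text):
    """Parse bracketed trees into nested (label, children) tuples.

    Leaves are plain strings. The extra outer bracket that Treebank
    files put around each sentence becomes a node with the label "".
    """
    tokens = TOKEN_RE.findall(text)
    pos = 0

    def parse_node():
        nonlocal pos
        pos += 1  # skip the opening "("
        label = ""
        if pos < len(tokens) and tokens[pos] not in ("(", ")"):
            label = tokens[pos]
            pos += 1
        children = []
        while pos < len(tokens) and tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(parse_node())
            else:
                children.append(tokens[pos])
                pos += 1
        if pos >= len(tokens):
            raise ValueError("unbalanced brackets: missing ')'")
        pos += 1  # skip the closing ")"
        return (label, children)

    trees = []
    while pos < len(tokens):
        if tokens[pos] != "(":
            raise ValueError("expected '(' but found %r" % tokens[pos])
        trees.append(parse_node())
    return trees


def leaves(tree):
    """Return the leaf strings of a tree, left to right."""
    _label, children = tree
    words = []
    for child in children:
        if isinstance(child, str):
            words.append(child)
        else:
            words.extend(leaves(child))
    return words
```

Calling read_trees on the contents of a .mrg file would give one tuple per
sentence. Note that leaves also returns trace elements (the children of -NONE-
nodes), so those would need to be filtered out if only the words are wanted.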

Best Regards,
Yannick Versley

-- 
Yannick Versley
Seminar für Sprachwissenschaft, Abt. Computerlinguistik
Wilhelmstr. 19, 72074 Tübingen
Tel.: (07071) 29 77352
-------------- next part --------------
A non-text attachment was scrubbed...
Name: penn2negra.pl
Type: application/x-perl
Size: 7228 bytes
Desc: not available
URL: <http://listserv.linguistlist.org/pipermail/corpora/attachments/20060418/7b9d03cc/attachment-0001.bin>

