[Corpora-List] Perl reader for Treebank parse trees?

Victor Kapustin victor.kapustin at gmail.com
Sun Apr 16 06:38:34 UTC 2006


 

> -----Original Message-----
> From: owner-corpora at lists.uib.no 
> [mailto:owner-corpora at lists.uib.no] On Behalf Of John Fry
> Sent: Sunday, April 16, 2006 9:25 AM
> To: Steven Bird
> Cc: Philip Resnik; CORPORA at uib.no
> Subject: Re: [Corpora-List] Perl reader for Treebank parse trees?
> 
> "Steven Bird" <sb at csse.unimelb.edu.au> writes:
> 
> > For those still wedded to Perl for NLP, consider the following Perl 
> > program to find all words in a text ending in "ing".  Note the 
> > 'magic', the bits of syntax like <>, (split), my, $, =~, which reduces
> > readability:
> >
> >   while (<>) {
> >       foreach my $word (split) {
> >           if ($word =~ /ing$/) {
> >               print "$word\n";
> >           }
> >       }
> >   }
> >
> > Here's the Python version, which contains far less magic:
> >
> >   import sys
> >   for line in sys.stdin.readlines():
> >       for word in line.split():
> >           if word.endswith('ing'):
> >               print word
> 
> #!/usr/bin/ruby
> puts scan(/\w+ing/) while gets
> 
Taking punctuation into account:
#!/usr/bin/perl
map {print "$_\n" } m/\b\w*ing\b/g while(<>) ;

Real magic!
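
For comparison, a rough Python sketch of the same word-boundary match as the Perl one-liner above. The names `ING_WORD` and `ing_words` are made up for illustration and do not come from the thread; the regular expression is the one used in the Perl version.

```python
import re

# Same pattern as the Perl one-liner above: a whole word ending in "ing".
ING_WORD = re.compile(r"\b\w*ing\b")

def ing_words(lines):
    """Yield each word ending in 'ing' from an iterable of text lines."""
    for line in lines:
        # findall returns the matched words only, without adjacent punctuation
        yield from ING_WORD.findall(line)
```

To filter standard input as the other versions do, one could loop over `ing_words(sys.stdin)` and print each word. Unlike a plain split on whitespace, the `\b` anchors keep a trailing comma or full stop from hiding a match, and unlike `\w+ing` without anchors they do not report the "sing" inside "singer".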


