[Corpora-List] Perl reader for Treebank parse trees?
Victor Kapustin
victor.kapustin at gmail.com
Sun Apr 16 06:38:34 UTC 2006
> -----Original Message-----
> From: owner-corpora at lists.uib.no
> [mailto:owner-corpora at lists.uib.no] On Behalf Of John Fry
> Sent: Sunday, April 16, 2006 9:25 AM
> To: Steven Bird
> Cc: Philip Resnik; CORPORA at uib.no
> Subject: Re: [Corpora-List] Perl reader for Treebank parse trees?
>
> "Steven Bird" <sb at csse.unimelb.edu.au> writes:
>
> > For those still wedded to Perl for NLP, consider the following Perl
> > program to find all words in a text ending in "ing". Note the
> > 'magic', the bits of syntax like <>, (split), my, $, =~, which reduces
> > readability:
> >
> > while (<>) {
> >     foreach my $word (split) {
> >         if ($word =~ /ing$/) {
> >             print "$word\n";
> >         }
> >     }
> > }
> >
> > Here's the Python version, which contains far less magic:
> >
> > import sys
> > for line in sys.stdin.readlines():
> >     for word in line.split():
> >         if word.endswith('ing'):
> >             print word
>
> #!/usr/bin/ruby
> puts scan(/\w+ing/) while gets
>
Taking punctuation into account:
#!/usr/bin/perl
map {print "$_\n" } m/\b\w*ing\b/g while(<>) ;
Real magic!
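
For comparison, here is a rough Python sketch of the same word-boundary match, set against the whitespace-split test from the quoted examples. The function names ing_words and ing_tokens are made up for illustration, and the sketch assumes Python 3:

```python
import re

# Same pattern as the Perl one-liner above: a whole word ending in "ing".
ING_WORD = re.compile(r"\b\w*ing\b")

def ing_words(text):
    """Return every word ending in 'ing', ignoring adjacent punctuation."""
    return ING_WORD.findall(text)

def ing_tokens(text):
    """Whitespace-split test from the quoted examples, for comparison."""
    return [tok for tok in text.split() if tok.endswith("ing")]
```

On a line such as "Singing, dancing. A ring!" the split-based test finds nothing, since every candidate token carries trailing punctuation, while the word-boundary pattern still picks out all three words.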