Corpora: code for random selection of concordance lines

Bruce L. Lambert, Ph.D. lambertb at uic.edu
Fri Mar 22 22:46:08 UTC 2002


I don't want to start a language war here, but I am quite partial to gawk,
sed and the other GNU text utilities:
http://www.gnu.org/manual/textutils-2.0/textutils.html . For my money, they
are the best free text tools available anywhere on any platform. Learn
these tools, and you'll rarely need anything else. I am also a big advocate
of Lisp as a text-processing language for rapid prototyping. See, e.g.,
http://www.lisp.org/table/contents.htm

Of course, Perl, Python, etc. are all fine (and free) as well.

-bruce

At 09:20 AM 3/22/2002 -0500, Sean Slattery wrote:

>I'd like to second Rosie's implicit point: learn to wield Perl. It
>will really accelerate your progress when dealing with text.
>
>As a nice intro, Rosie's YAPC talk on natural language processing and
>Perl is as good a starting place as any:
>
>http://www-2.cs.cmu.edu/~rosie/yapc/
>
>S.


