I'd like to second Rosie's implicit point: learn to wield Perl. It
will really accelerate your progress when dealing with text.

As a nice intro, Rosie's YAPC talk on natural language processing and
Perl is as good a starting place as any:


