[Corpora-List] sentence boundary detectors

Armin Schmidt armin.sch at gmail.com
Sat Feb 17 19:48:21 UTC 2007


Dear list,

I was wondering if you could point me to good sentence splitters for the
following languages: German, Russian, Spanish, and English. It would be
great if they were stand-alone programs or modules for Python (Perl
would be OK, too, although I'm already aware of the respective CPAN
modules for English and German).

Since I have corpora in all of the above-mentioned languages, I would
also be very interested in available implementations (not papers) of
unsupervised learning methods for detecting sentence boundaries (or
rather, abbreviations).

Thanks,
Armin


