[Corpora-List] sentence boundary detectors
Scott Songlin Piao
scott.piao at manchester.ac.uk
Mon Feb 19 12:31:40 UTC 2007
Hi Armin,
I have put my English sentence splitter on the web:
http://text0.mib.man.ac.uk:8080/sentencebreaker/heuristic_tool
It is a rule-based Java program and can be downloaded.
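
For readers unfamiliar with the approach, here is a minimal sketch of what a rule-based splitter of this general kind can look like. It is written in Python for brevity and is not the code of the Java tool above. The abbreviation list, the function name split_sentences and the boundary rule (end punctuation followed by whitespace and an uppercase letter, quote or bracket) are all assumptions made for illustration.

```python
import re

# Hypothetical abbreviation list for illustration; a real tool needs far more.
ABBREVIATIONS = {"dr.", "mr.", "mrs.", "prof.", "e.g.", "i.e.", "etc.", "vs."}

# Candidate boundary: ., ! or ? followed by whitespace and then an
# uppercase letter, a quote or an opening bracket.
BOUNDARY = re.compile(r'[.!?]+(?=\s+[A-Z"\'(])')


def split_sentences(text):
    """Split text at candidate boundaries, skipping known abbreviations."""
    sentences = []
    start = 0
    for match in BOUNDARY.finditer(text):
        end = match.end()
        # Last whitespace-delimited token before the candidate boundary.
        tokens = text[start:end].split()
        last_token = tokens[-1].lower() if tokens else ""
        if last_token in ABBREVIATIONS:
            continue  # e.g. "Dr. Smith": not a sentence end
        sentences.append(text[start:end].strip())
        start = end
    tail = text[start:].strip()
    if tail:
        sentences.append(tail)
    return sentences


if __name__ == "__main__":
    sample = "Dr. Smith arrived late. He apologised! Was it e.g. traffic? Nobody knew."
    for sentence in split_sentences(sample):
        print(sentence)
```

A fixed list like this will wrongly join sentences that really do end in an abbreviation (for example "... pears etc. Then we left."), which is one reason such tools usually carry many more rules.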
Cheers
Scott Piao
----------------------------
Text Mining
School of Computer Science
The University of Manchester
UK
-----Original Message-----
From: owner-corpora at lists.uib.no [mailto:owner-corpora at lists.uib.no] On Behalf Of Armin Schmidt
Sent: 17 February 2007 19:48
To: corpora at uib.no
Subject: [Corpora-List] sentence boundary detectors
Dear list,
I was wondering if you could point me to good sentence splitters for the
following languages: German, Russian, Spanish, English. It would be
great if they were stand-alone programs or modules for Python (Perl
would be ok, too ... although I'm already aware of the respective
CPAN-modules for English and German).
Since I do have corpora in all the above-mentioned languages, I would
also be very interested in available implementations (not papers) of any
unsupervised learning methods for detecting sentence boundaries (or
rather abbreviations).
Thanks,
Armin