[Corpora-List] English tokenizer

Kevin B. Cohen kevin.cohen at gmail.com
Thu Aug 16 15:34:50 UTC 2007


We have had good luck with Andrew Roberts's jTokeniser.  It has the
advantage of having very low overhead.

http://today.java.net/pub/n/jTokeniser1.2

Kev

On 8/16/07, ben dbabis samira <bendbabis_samira at yahoo.fr> wrote:
> Hi,
> I would be grateful if you could give me references to software (Java
> implementation) that can tokenize text into sentences (based not only on
> punctuation delimiters).
>
> Thanks for your help
> Samira BEN DBABIS
> MIRACL Laboratory
> Sfax, TUNISIA
>
> _______________________________________________
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora
>
>


-- 
K. B. Cohen
Biomedical Text Mining Group Lead
Center for Computational Pharmacology
303-724-7563 (office) 303-916-2417 (cell) 303-377-9194 (home)
http://compbio.uchsc.edu/Hunter_lab/Cohen
