[Corpora-List] Free Java Tokenizer for english
Alexandre Rafalovitch
arafalov at gmail.com
Thu Nov 20 17:00:26 UTC 2008
I don't believe there is full agreement on tokenization rules for
English (e.g. how to split "don't"), but have a look at:
http://www.andy-roberts.net/software/jTokeniser/
and
http://www.gate.ac.uk/
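
To illustrate the "don't" problem, here is a minimal sketch using only the
JDK (java.util.regex). It is not the API of jTokeniser or GATE; the class
name, method names, and regular expressions are my own assumptions, chosen
only to show three conventions that give different tokens for the same
sentence.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration class, not part of any library mentioned above.
public class TokenizerSketch {

    // Convention B (assumed, Penn Treebank-like): split off the "n't" clitic,
    // other apostrophe clitics such as "'s", and single punctuation marks.
    private static final Pattern CLITIC_AWARE = Pattern.compile(
        "[A-Za-z]+(?=n't)|n't|'[A-Za-z]+|[A-Za-z0-9]+|[^\\sA-Za-z0-9]");

    // Convention C (assumed): runs of letters/digits, every other
    // non-space character becomes its own token.
    private static final Pattern ALNUM_ONLY = Pattern.compile(
        "[A-Za-z0-9]+|[^\\sA-Za-z0-9]");

    // Convention A: split on whitespace only; punctuation stays attached.
    public static List<String> whitespaceTokens(String text) {
        List<String> tokens = new ArrayList<>();
        for (String piece : text.trim().split("\\s+")) {
            if (!piece.isEmpty()) {
                tokens.add(piece);
            }
        }
        return tokens;
    }

    public static List<String> cliticAwareTokens(String text) {
        return findAll(CLITIC_AWARE, text);
    }

    public static List<String> alnumTokens(String text) {
        return findAll(ALNUM_ONLY, text);
    }

    // Collect every successive match of the pattern as one token.
    private static List<String> findAll(Pattern pattern, String text) {
        List<String> tokens = new ArrayList<>();
        Matcher m = pattern.matcher(text);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String s = "I don't know.";
        System.out.println(whitespaceTokens(s));  // [I, don't, know.]
        System.out.println(cliticAwareTokens(s)); // [I, do, n't, know, .]
        System.out.println(alnumTokens(s));       // [I, don, ', t, know, .]
    }
}
```

Which of these is "right" depends on what the downstream tool (tagger,
parser, frequency counter) expects, so check the convention before
comparing results across tokenizers.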
Regards,
Alex.
Personal blog: http://blog.outerthoughts.com/
Research group: http://www.clt.mq.edu.au/Research/
On Thu, Nov 20, 2008 at 11:41 AM, ben dbabis samira
<bendbabis_samira at yahoo.fr> wrote:
> Hi,
> Does anyone know of free tokenizers implemented in Java for
> English text?
> Thanks for your help
_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora