[Corpora-List] Sentence boundary detection
Mehmet Kayaalp
Mehmet.Kayaalp at nih.gov
Tue Jul 24 16:15:40 UTC 2007
Last year, we examined 13 open-source, freeware software packages that
perform natural language tokenization (many of them also perform sentence
boundary detection and more) and summarized our experience in a technical
report, which is available at
http://lhncbc.nlm.nih.gov/lhc/docs/reports/2006/tr2006003.pdf
Best,
--mehmet
Mehmet Kayaalp
Lister Hill National Center for Biomedical Communications
Building 38A
National Institutes of Health
8600 Rockville Pike MSC-3828
Bethesda, MD 20894
(301) 451-4633, Fax: (301) 402-0118
Mehmet.Kayaalp at nih.gov
-----Original Message-----
From: corpora-bounces at uib.no [mailto:corpora-bounces at uib.no] On Behalf Of
Kelly Vincent
Sent: Friday, July 20, 2007 10:11 AM
To: corpora at uib.no
Subject: [Corpora-List] Sentence boundary detection
I am interested in the current state of the art in sentence boundary
detection and (to a lesser degree) tokenization. I have been able to locate
several articles, but very few of them are recent. I would appreciate any
pointers to particularly important papers or to available tools, as well as
the community's thoughts on the topic.
We are building a Spanish corpus, so I am particularly interested in these
topics from the Spanish perspective, though my interest is not confined to
Spanish.
Regards,
Kelly Vincent
Software Engineer
MetaMetrics, Inc.
_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora