[Corpora-List] Chinese sentence detector or splitter

Craig Pfeifer craig.pfeifer at gmail.com
Sun Apr 21 15:42:37 UTC 2013


The latest version of Stanford CoreNLP can process Chinese text and
includes a sentence splitter.

If you have issues with the Stanford tools, you can send mail to the users'
list: java-nlp-user at lists.stanford.edu
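
If the goal is only to pull out the segments that end with the period sign,
a plain regular expression may be enough as a first pass. The following is a
minimal sketch, not CoreNLP and not a real sentence splitter. It assumes that
titles, subtitles and stand-alone numbers sit on their own lines and that the
reports use the full-width period "。". The function name extract_sentences
and the sample text are made up for illustration.

```python
import re


def extract_sentences(text, terminators="。"):
    """Return the segments of `text` that end with one of `terminators`.

    Assumption: titles, subtitles and stand-alone numbers are on their own
    lines and carry no terminator, so they never match and are dropped.
    """
    cls = re.escape(terminators)
    # One or more non-terminator characters followed by a single terminator.
    pattern = re.compile("[^" + cls + "]+[" + cls + "]")
    sentences = []
    for line in text.splitlines():
        for match in pattern.finditer(line):
            sentences.append(match.group().strip())
    return sentences


# Hypothetical sample: two heading lines, one line with two sentences,
# and a stand-alone number.
sample = "年度报告\n一、公司概况\n本公司成立于1990年。主要业务为出口贸易。\n2012\n"
result = extract_sentences(sample)
for sentence in result:
    print(sentence)
```

Passing terminators="。！？" would also keep questions and exclamations. A
heading that shares a line with a sentence will be glued to that sentence,
and a closing quotation mark after the period is not handled, so this is
only a rough filter under the assumptions stated above.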

______________
craig.pfeifer at gmail.com


On Sun, Apr 21, 2013 at 4:34 AM, Xin Ying Qiu <xinying.qiu at gmail.com> wrote:

> Hello,
>
> I am processing Chinese reports that contain titles and subtitles (as
> phrases) as well as sentences ending with the period sign.  I want to
> extract the sentences ending with the period sign, but it is difficult to
> identify where such sentences begin, because a document may contain
> stand-alone phrases and numbers.  It does not consist only of
> sentences ending with period signs.  Are there any tools available to
> detect, split, or extract Chinese sentences from a document?
>
> I've tried the Stanford NLP document preprocessing tool,
> edu.stanford.nlp.process.DocumentPreprocessor, but it does not seem to
> work for my documents.
>
> Thank you in advance for any advice and suggestions!
>
> Sincerely,
>
> Xin Ying
>
>
> _______________________________________________
> UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora
>
>