[Corpora-List] Chinese sentence detector or splitter
Xin Ying Qiu
xinying.qiu at gmail.com
Sun Apr 21 08:34:45 UTC 2013
Hello,
I am processing Chinese reports that contain phrases used as titles and
subtitles as well as sentences ending with a period. I want to extract
only the sentences that end with a period, but it is difficult to identify
where such a sentence begins, because the documents also contain
stand-alone phrases and numbers rather than consisting solely of
period-terminated sentences. Are there any tools available to detect,
split, or extract Chinese sentences from a document?
I've tried the Stanford NLP document preprocessing tool,
edu.stanford.nlp.process.DocumentPreprocessor, but it does not seem to
work for my documents.
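
A minimal sketch of the kind of extraction described above, not a
replacement for a proper tool. It assumes that the "period sign" is the
Chinese full stop "。" (U+3002), that titles and stand-alone phrases sit on
their own lines, and that a sentence does not wrap across lines. The names
SENTENCE_END and extract_sentences are hypothetical, chosen only for
illustration.

```python
import re

# Assumption: sentences end with the Chinese full stop (U+3002).
SENTENCE_END = "。"

# One sentence = a run of non-period characters followed by a period.
SENTENCE_PATTERN = re.compile(
    "[^" + re.escape(SENTENCE_END) + "]+" + re.escape(SENTENCE_END)
)


def extract_sentences(text):
    """Return period-terminated segments, skipping lines without a period."""
    sentences = []
    for line in text.splitlines():
        line = line.strip()
        if SENTENCE_END not in line:
            # No period on this line: treat it as a title, phrase, or number.
            continue
        for segment in SENTENCE_PATTERN.findall(line):
            sentences.append(segment.strip())
    return sentences


if __name__ == "__main__":
    sample = "一、概述\n本报告介绍公司情况。业绩良好。\n2013\n结论"
    for sentence in extract_sentences(sample):
        print(sentence)
```

Text that follows the last period on a line is dropped, and a title that
shares a line with a sentence would be kept as part of that sentence, so
the line-based assumption matters.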
Thank you in advance for any advice and suggestions!
Sincerely,
Xin Ying