<div dir="ltr"><div><div><div><div>Thank you all for your most instructive comments and suggestions!<br><br></div>From a research point of view, Huang/Cheng (2011) is closest to the problem I'm dealing with. Kiss/Strunk (2006) is valuable in that their system works for eleven languages and across different text genres. Although it was not tested on Chinese, one could learn a lot from its methodology.<br>
<br></div>For my current task, I may ask the Stanford NLP users list for more advice. It could be that I have not searched the right archives or found the right tools.<br><br></div>Thanks again!<br><br></div>
Xin Ying<br><div class="gmail_extra"><br><br><div class="gmail_quote">On Sun, Apr 21, 2013 at 11:42 PM, Craig Pfeifer <span dir="ltr"><<a href="mailto:craig.pfeifer@gmail.com" target="_blank">craig.pfeifer@gmail.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">The latest version of Stanford CoreNLP will process Chinese text and contains a sentence splitter. <div>
<br></div><div>If you have issues with the Stanford tools, you can send mail to the users list: <a href="mailto:java-nlp-user@lists.stanford.edu" target="_blank">java-nlp-user@lists.stanford.edu</a></div>
</div><div class="gmail_extra"><br clear="all"><div>______________<br><a href="mailto:craig.pfeifer@gmail.com" target="_blank">craig.pfeifer@gmail.com</a></div>
<br><br><div class="gmail_quote"><div><div>On Sun, Apr 21, 2013 at 4:34 AM, Xin Ying Qiu <span dir="ltr"><<a href="mailto:xinying.qiu@gmail.com" target="_blank">xinying.qiu@gmail.com</a>></span> wrote:<br>
</div></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><div>
<div dir="ltr"><div><div><div><div>Hello,<br><br></div>I am processing Chinese reports that contain phrases serving as titles and subtitles, as well as sentences ending with the period sign. I want to extract the sentences ending with the period sign, but it is difficult to identify where such sentences begin, because the documents may also contain stand-alone phrases and numbers; they do not consist only of sentences ending with period signs. Are there any tools available to detect, split, or extract Chinese sentences from a document? <br>
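<br>To make the task concrete, here is a minimal Python sketch of the extraction described above. It is only an illustration and not part of any tool mentioned in this thread. It assumes that the period sign is the Chinese full stop "。", that titles and stand-alone phrases sit on their own lines, and that a sentence never spans a line break. The names <code>SENTENCE</code> and <code>extract_sentences</code> and the sample report text are hypothetical.<br>
<pre>
```python
import re

# Assumption: a sentence is a run of text within one line that ends with the
# Chinese full stop "。". Lines or trailing fragments without "。" are treated
# as titles, subtitles, or stand-alone phrases and are skipped.
SENTENCE = re.compile(r"[^。\n]+。")


def extract_sentences(text):
    """Return every full-stop-terminated sentence found in text, in order."""
    sentences = []
    for line in text.splitlines():
        for match in SENTENCE.finditer(line):
            sentences.append(match.group().strip())
    return sentences


# Hypothetical sample: a heading, two sentences on one line, a stand-alone
# phrase, and one more sentence.
report = "一、公司概况\n本公司成立于2001年。主要业务为软件开发。\n2012年度\n收入持续增长。"

for sentence in extract_sentences(report):
    print(sentence)
```
</pre>
This simple rule cannot separate a heading from a sentence when both appear on the same line, which is exactly the hard case: the start of the sentence is then ambiguous.<br>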
<br></div>I've tried the Stanford NLP document preprocessing tool, edu.stanford.nlp.process.DocumentPreprocessor, but it does not seem to work for my document. <br><br></div>Thank you in advance for any advice and suggestions!<br>
<br></div>Sincerely,<br><br>Xin Ying<br><br></div>
<br></div></div><div>_______________________________________________<br>
UNSUBSCRIBE from this page: <a href="http://mailman.uib.no/options/corpora" target="_blank">http://mailman.uib.no/options/corpora</a><br>
Corpora mailing list<br>
<a href="mailto:Corpora@uib.no" target="_blank">Corpora@uib.no</a><br>
<a href="http://mailman.uib.no/listinfo/corpora" target="_blank">http://mailman.uib.no/listinfo/corpora</a><br>
<br></div></blockquote></div><br></div>
</blockquote></div><br></div></div>