Incorrect word segmenting for Chinese characters

Brian MacWhinney macw at cmu.edu
Tue Dec 13 19:00:36 UTC 2016


You have to add them manually.

--Brian

From: <chibolts at googlegroups.com> on behalf of Toh An <popsune1 at gmail.com>
Reply-To: "chibolts at googlegroups.com" <chibolts at googlegroups.com>
Date: Wednesday, December 14, 2016 at 2:28 AM
To: chibolts <chibolts at googlegroups.com>
Subject: Re: Incorrect word segmenting for Chinese characters

Dear Brian,

Thank you for the prompt reply! Is there an automated way of adding the spaces, or do we have to add them manually?

Toh

On Tuesday, December 13, 2016 at 5:20:10 PM UTC+8, Toh An wrote:

Hi, I have encountered a problem with Chinese data. CLAN does not appear to segment Chinese sentences into word tokens correctly, and part-of-speech tagging is affected as well. Attached is the CLAN output after running the mlu and freq commands on a test file without a %mor tier (TestFileOutput), and on the same test file with a %mor tier added (TestFileMor). Does anyone have any ideas on how to resolve this? Thanks.
--
You received this message because you are subscribed to the Google Groups "chibolts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to chibolts+unsubscribe at googlegroups.com.
To post to this group, send email to chibolts at googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/chibolts/647f189b-a76a-4e53-83cb-fe30d3acdb59%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
