<html xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta name="Title" content="">
<meta name="Keywords" content="">
<meta name="Generator" content="Microsoft Word 15 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
        {font-family:"Cambria Math";
        panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
        {font-family:Calibri;
        panose-1:2 15 5 2 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0in;
        margin-bottom:.0001pt;
        font-size:12.0pt;
        font-family:"Times New Roman";}
a:link, span.MsoHyperlink
        {mso-style-priority:99;
        color:blue;
        text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
        {mso-style-priority:99;
        color:purple;
        text-decoration:underline;}
p
        {mso-style-priority:99;
        mso-margin-top-alt:auto;
        margin-right:0in;
        mso-margin-bottom-alt:auto;
        margin-left:0in;
        font-size:12.0pt;
        font-family:"Times New Roman";}
span.EmailStyle18
        {mso-style-type:personal-reply;
        font-family:Calibri;
        color:windowtext;}
span.msoIns
        {mso-style-type:export-only;
        mso-style-name:"";
        text-decoration:underline;
        color:teal;}
.MsoChpDefault
        {mso-style-type:export-only;
        font-size:10.0pt;}
@page WordSection1
        {size:8.5in 11.0in;
        margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
        {page:WordSection1;}
--></style>
</head>
<body bgcolor="white" lang="EN-US" link="blue" vlink="purple">
<div class="WordSection1">
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Calibri">Dear Toh An,<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Calibri">     You have to separate Chinese words with spaces.  Here is a version that gets analyzed correctly after you put in the spaces.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Calibri"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Calibri">-- Brian MacWhinney<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:Calibri"><o:p> </o:p></span></p>
<div style="border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal"><b><span style="font-family:Calibri;color:black">From: </span>
</b><span style="font-family:Calibri;color:black"><chibolts@googlegroups.com> on behalf of Toh An <popsune1@gmail.com><br>
<b>Reply-To: </b>"chibolts@googlegroups.com" <chibolts@googlegroups.com><br>
<b>Date: </b>Tuesday, December 13, 2016 at 5:20 PM<br>
<b>To: </b>chibolts <chibolts@googlegroups.com><br>
<b>Subject: </b>Incorrect word segmenting for Chinese characters<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
<div>
<p><span style="font-family:Calibri;color:black">Hi, I have encountered a problem with Chinese data. Clan does not appear to segment Chinese sentences into word tokens correctly. Part of speech tagging is also affected. Attached is the clan output after running
 mlu and freq commands on a test file without mor tier (TestFileOutput), and the same test file with mor tier added (TestFileMor). Does anyone have any ideas how to resolve this? Thanks.</span><o:p></o:p></p>
</div>
<p class="MsoNormal">-- <br>
You received this message because you are subscribed to the Google Groups "chibolts" group.<br>
To unsubscribe from this group and stop receiving emails from it, send an email to
<a href="mailto:chibolts+unsubscribe@googlegroups.com">chibolts+unsubscribe@googlegroups.com</a>.<br>
To post to this group, send email to <a href="mailto:chibolts@googlegroups.com">chibolts@googlegroups.com</a>.<br>
To view this discussion on the web visit <a href="https://groups.google.com/d/msgid/chibolts/f1080fe4-4646-4e24-bd2f-7b6b753d7c54%40googlegroups.com?utm_medium=email&utm_source=footer">
https://groups.google.com/d/msgid/chibolts/f1080fe4-4646-4e24-bd2f-7b6b753d7c54%40googlegroups.com</a>.<br>
For more options, visit <a href="https://groups.google.com/d/optout">https://groups.google.com/d/optout</a>.<br>
<br>
<o:p></o:p></p>
</div>
</body>
</html>

<p></p>

-- <br />
You received this message because you are subscribed to the Google Groups "chibolts" group.<br />
To unsubscribe from this group and stop receiving emails from it, send an email to <a href="mailto:chibolts+unsubscribe@googlegroups.com">chibolts+unsubscribe@googlegroups.com</a>.<br />
To post to this group, send email to <a href="mailto:chibolts@googlegroups.com">chibolts@googlegroups.com</a>.<br />
To view this discussion on the web visit <a href="https://groups.google.com/d/msgid/chibolts/4FFD79F6-FFDD-4187-841A-0109FCBAB9CD%40andrew.cmu.edu?utm_medium=email&utm_source=footer">https://groups.google.com/d/msgid/chibolts/4FFD79F6-FFDD-4187-841A-0109FCBAB9CD%40andrew.cmu.edu</a>.<br />
For more options, visit <a href="https://groups.google.com/d/optout">https://groups.google.com/d/optout</a>.<br />