<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv=Content-Type content="text/html; charset=us-ascii">
<meta name=Generator content="Microsoft Word 11 (filtered medium)">
<style>
<!--
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
margin-bottom:.0001pt;
font-size:12.0pt;
font-family:"Times New Roman";}
a:link, span.MsoHyperlink
{color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{color:purple;
text-decoration:underline;}
span.EmailStyle17
{mso-style-type:personal-compose;
font-family:Arial;
color:windowtext;}
@page Section1
{size:8.5in 11.0in;
margin:1.0in 1.25in 1.0in 1.25in;}
div.Section1
{page:Section1;}
-->
</style>
</head>
<body lang=EN-US link=blue vlink=purple>
<div class=Section1>
<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt;
font-family:Arial'>I’m curious about the errors that can occur in sentence
boundary detection in Arabic and in Mandarin. How do these errors vary with
the corpora used for training and testing (LDC, Web-derived, or other)? Pointers
to relevant reading would be appreciated, as would any other suggestions from
anyone who has trained a detector on those languages.<o:p></o:p></span></font></p>
<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt;
font-family:Arial'><o:p> </o:p></span></font></p>
<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt;
font-family:Arial'>Thank you,<o:p></o:p></span></font></p>
<p class=MsoNormal><font size=2 face=Arial><span style='font-size:10.0pt;
font-family:Arial'><o:p> </o:p></span></font></p>
<p class=MsoNormal><font size=2 face="Times New Roman"><span style='font-size:
10.0pt'>Eric Garbin</span></font><o:p></o:p></p>
<p class=MsoNormal><font size=2 face="Times New Roman"><span style='font-size:
10.0pt'>Computational Linguist</span></font><o:p></o:p></p>
<p class=MsoNormal><font size=2 face="Times New Roman"><span style='font-size:
10.0pt'>The Technology Development Group</span></font><o:p></o:p></p>
<p class=MsoNormal><font size=2 face="Times New Roman"><span style='font-size:
10.0pt'><a href="http://www.thetdgroup.com">www.thetdgroup.com</a></span></font><o:p></o:p></p>
<p class=MsoNormal><font size=2 face="Times New Roman"><span style='font-size:
10.0pt'>571-262-2693</span></font><o:p></o:p></p>
<p class=MsoNormal><font size=3 face="Times New Roman"><span style='font-size:
12.0pt'><o:p> </o:p></span></font></p>
</div>
</body>
</html>