<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40"><head><meta http-equiv=Content-Type content="text/html; charset=us-ascii"><meta name=Generator content="Microsoft Word 14 (filtered medium)"><style><!--
/* Font Definitions */
@font-face
        {font-family:Calibri;
        panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
        {font-family:Tahoma;
        panose-1:2 11 6 4 3 5 4 4 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0cm;
        margin-bottom:.0001pt;
        font-size:12.0pt;
        font-family:"Times New Roman","serif";}
a:link, span.MsoHyperlink
        {mso-style-priority:99;
        color:blue;
        text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
        {mso-style-priority:99;
        color:purple;
        text-decoration:underline;}
span.EmailStyle17
        {mso-style-type:personal-reply;
        font-family:"Calibri","sans-serif";
        color:#1F497D;}
.MsoChpDefault
        {mso-style-type:export-only;
        mso-fareast-language:EN-US;}
@page WordSection1
        {size:612.0pt 792.0pt;
        margin:72.0pt 72.0pt 72.0pt 72.0pt;}
div.WordSection1
        {page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]--></head><body lang=EN-GB link=blue vlink=purple><div class=WordSection1><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>I’ve used Mallet (<a href="http://mallet.cs.umass.edu/">http://mallet.cs.umass.edu/</a>). It is very easy to install and use. Loading a collection of 1,700 documents (113m words) took around 4 hours on a 2.27 GHz Linux machine, with Java set to use 4 GB of memory. After that, training a naive Bayes classifier and classifying documents with it are pretty well instant.<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>Hope this is enough information for you to tell if it meets your needs.<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New";color:#1F497D'>Pete Whitelock, PhD</span><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><br></span><span style='font-size:10.0pt;font-family:"Courier New";color:#1F497D'>Principal Language Engineer, Technology<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New";color:#1F497D'>Academic Dictionaries</span><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'> <br></span><span style='font-size:10.0pt;font-family:"Courier New";color:#1F497D'>Oxford University Press</span><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><o:p> </o:p></span></p><p class=MsoNormal><b><span lang=EN-US 
style='font-size:10.0pt;font-family:"Tahoma","sans-serif"'>From:</span></b><span lang=EN-US style='font-size:10.0pt;font-family:"Tahoma","sans-serif"'> corpora-bounces@uib.no [mailto:corpora-bounces@uib.no] <b>On Behalf Of </b>Md. Hasanuzzaman<br><b>Sent:</b> 13 September 2012 11:17<br><b>To:</b> corpora@uib.no; mt-list@eamt.org<br><b>Subject:</b> [Corpora-List] Naive Bayes Classifier tool<o:p></o:p></span></p><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal style='margin-bottom:12.0pt'>Kindly suggest any Classification Tool for Naive Bayes Classifier (other than Weka) that efficiently handles String data.<br clear=all><br>-- <br><b><span style='font-size:10.0pt;font-family:"Arial","sans-serif";color:#330033'>Md.Hasanuzzaman</span></b><br><b><span style='font-size:10.0pt;font-family:"Arial","sans-serif";color:#330033'>Senior Research Engineer</span></b><br><b><span style='font-size:10.0pt;font-family:"Arial","sans-serif";color:#330033'>Department of Computer Science and Engineering</span></b><br><b><span style='font-size:10.0pt;font-family:"Arial","sans-serif";color:#330033'>Jadavpur University</span></b><br><b><span style='font-size:10.0pt;font-family:"Arial","sans-serif";color:#330033'>Kolkata-700 032<br><br></span></b><o:p></o:p></p></div>
</body></html>