<div>Thanks, Pham.</div><div><br></div><div>I have found the solution.</div><div>The manual page (<a href="http://mecab.sourceforge.net/dic.html">http://mecab.sourceforge.net/dic.html</a>) includes what I need.</div><div>I also asked a friend who knows Japanese to explain it to me.</div><div><br></div><div>I wish my English were better, so that I could provide colleagues with an English version of the manual.</div><div><br></div><div>-Hongfei Jiang</div><div><includetail><div> </div><div> </div><div style="font:Verdana normal 14px;color:#000;"><div style="FONT-SIZE: 12px;FONT-FAMILY: Arial Narrow;padding:2px 0 2px 0;">------------------ Original ------------------</div><div style="FONT-SIZE: 12px;background:#efefef;padding:8px;"><div id="menu_sender"><b>From: </b> "Minh Pham"&lt;minhpham0902@gmail.com&gt;;</div><div><b>Date: </b> Thu, Oct 20, 2011 04:04 PM</div><div><b>To: </b> "Adam Kilgarriff"&lt;adam@lexmasterclass.com&gt;; <wbr></div><div><b>Cc: </b> "hf.jiang"&lt;hf.jiang@gmail.com&gt;; "corpora"&lt;corpora@uib.no&gt;; "Hiram Calvo"&lt;hiramcalvo@gmail.com&gt;; "Jan Pomikále"&lt;xpomikal@fi.muni.cz&gt;; <wbr></div><div><b>Subject: </b> Re: [Corpora-List] How to do Japanese word segmentation using extraterm list?</div></div><div> </div>Hi,<div><br></div><div>Could you please tell us exactly what the input is and what the desired output is?</div><div><br></div><div>By the way, after installing the MeCab tool, you can view its help from the command line by typing:</div>
<div><br></div><div>mecab.exe --help</div><div><br></div><div>The help is in English.</div><div><br></div><div>Best regards,</div><div>Pham</div><div><br><div class="gmail_quote">On Thu, Oct 20, 2011 at 4:22 PM, Adam Kilgarriff <span dir="ltr">&lt;<a href="mailto:adam@lexmasterclass.com">adam@lexmasterclass.com</a>&gt;</span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;"><div class="im">&gt; However, since almost all of the user manual is in Japanese, I cannot understand it completely.<div><br>
</div></div><div>We have the same problem. Are there any English versions anywhere (especially for MeCab)? Pointers and advice appreciated.</div>
<div><br></div><div>Adam<br><br><div class="gmail_quote"><div><div></div><div class="h5">On 20 October 2011 08:08, hf.jiang <span dir="ltr">&lt;<a href="mailto:hf.jiang@gmail.com" target="_blank">hf.jiang@gmail.com</a>&gt;</span> wrote:<br>
</div></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><div></div><div class="h5">
<div>Hi all,</div><div><br></div><div> I'm currently trying to process Japanese texts.</div><div> Some friends suggested that I use ChaSen or MeCab.</div><div> However, since almost all of the user manual is in Japanese, I cannot understand it completely.</div>
<div> I would like the segmentation tool to give preference to the words in my term list.</div><div> </div><div> Note that I do not have enough gold-standard data to train the tools, so an off-the-shelf tool would be better for me.</div>
<div><br></div><div> Looking forward to your reply, thanks.</div><div><br></div><div>-Hongfei Jiang</div><br></div></div>_______________________________________________<br>
UNSUBSCRIBE from this page: <a href="http://mailman.uib.no/options/corpora" target="_blank">http://mailman.uib.no/options/corpora</a><br>
Corpora mailing list<br>
<a href="mailto:Corpora@uib.no" target="_blank">Corpora@uib.no</a><br>
<a href="http://mailman.uib.no/listinfo/corpora" target="_blank">http://mailman.uib.no/listinfo/corpora</a><br>
<br></blockquote></div><font color="#888888"><br><br clear="all"><div><br></div>-- <br>========================================<br><a href="http://www.kilgarriff.co.uk/" target="_blank">Adam Kilgarriff</a> <a href="mailto:adam@lexmasterclass.com" target="_blank">adam@lexmasterclass.com</a> <br>
Director <a href="http://www.sketchengine.co.uk/" target="_blank">Lexical Computing Ltd</a> <br>Visiting Research Fellow <a href="http://leeds.ac.uk" target="_blank">University of Leeds</a> <div>
<i><font color="#006600">Corpora for all</font></i> with <a href="http://www.sketchengine.co.uk" target="_blank">the Sketch Engine</a> </div><div> <i><a href="http://www.webdante.com" target="_blank">DANTE: <font color="#009900">a lexical database for English</font></a><font color="#009900"> </font> </i><div>
========================================</div></div><br>
</font></div>
<br></blockquote></div><br><br clear="all"><div><br></div>-- <br>Pham Quang Nhat Minh (Mr)<br>PhD student<br>NLP Laboratory - School of Information Science - JAIST<br>1-1 Asahidai, Nomi, 923-1292 Japan<br>Email: <a href="mailto:minhpqn@jaist.ac.jp" target="_blank">minhpqn@jaist.ac.jp</a><br>
Web: <a href="http://www.jaist.ac.jp/index-e.html" target="_blank">http://www.jaist.ac.jp/index-e.html</a><br><br>
</div>
</div></includetail></div>