[Corpora-List] How to do Japanese word segmentation using extra term list?

hf.jiang hf.jiang at gmail.com
Thu Oct 20 07:08:28 UTC 2011


Hi all,


    I'm currently trying to process Japanese texts.
    Some friends suggested that I use ChaSen or MeCab.
    However, since almost all of the user manuals are in Japanese, I cannot understand them completely.
    What I want is a segmentation tool that prefers the words in my own term list when it segments the text.
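
    To make this expectation concrete, here is a minimal sketch in Python of the behaviour I am looking for.
    It is not ChaSen or MeCab code: the function name segment_with_terms and the character-level fallback are
    assumptions made only for illustration, and a real tool would replace the fallback with its own segmenter.

```python
def segment_with_terms(text, terms, fallback=list):
    """Greedy longest-match segmentation that keeps listed terms whole.

    `fallback` is a hypothetical stand-in for a real segmenter; by default
    it simply splits the remaining text into single characters.
    """
    term_set = set(t for t in terms if t)
    max_len = max((len(t) for t in term_set), default=0)
    tokens, buffer, i = [], "", 0
    while i < len(text):
        match = None
        # Try the longest possible term starting at position i first.
        for n in range(min(max_len, len(text) - i), 0, -1):
            if text[i:i + n] in term_set:
                match = text[i:i + n]
                break
        if match:
            # Flush text collected so far through the fallback segmenter.
            if buffer:
                tokens.extend(fallback(buffer))
                buffer = ""
            tokens.append(match)
            i += len(match)
        else:
            buffer += text[i]
            i += 1
    if buffer:
        tokens.extend(fallback(buffer))
    return tokens


if __name__ == "__main__":
    print(segment_with_terms("自然言語処理を学ぶ", ["自然言語処理"]))
```

    That is, any string found in my term list should come out as a single token, and only the rest of the
    text should be left to the tool's normal segmentation.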
    
    Note that I do not have enough gold-standard data to train these tools, so an off-the-shelf tool would be better for me.


    Looking forward to your reply, thanks.


-Hongfei Jiang