[Corpora-List] How to do Japanese word segmentation using extra term list?
Adam Kilgarriff
adam at lexmasterclass.com
Thu Oct 20 07:22:02 UTC 2011
> However, since almost all of the user manual is in Japanese, I cannot
> understand it completely.
We have the same problem. Are there English versions anywhere (especially
for MeCab)? Pointers and advice appreciated.
Adam
On 20 October 2011 08:08, hf.jiang <hf.jiang at gmail.com> wrote:
> Hi all,
>
> I'm currently trying to process Japanese texts.
> Some friends suggested that I use ChaSen or MeCab.
> However, since almost all of the user manual is in Japanese, I cannot
> understand it completely.
> I would like the segmentation tool to give preference to the words in
> my term list.
>
> Note that I do not have enough gold data to train the tools,
> so an off-the-shelf tool would be better for me.
>
> Looking forward to your reply, thanks.
>
> -Hongfei Jiang
>
> _______________________________________________
> UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora
>
>
--
========================================
Adam Kilgarriff <http://www.kilgarriff.co.uk/>
adam at lexmasterclass.com
Director, Lexical Computing Ltd <http://www.sketchengine.co.uk/>
Visiting Research Fellow, University of Leeds <http://leeds.ac.uk>
*Corpora for all* with the Sketch Engine <http://www.sketchengine.co.uk>
*DANTE: a lexical database for English* <http://www.webdante.com>
========================================