[Corpora-List] How to do Japanese word segmentation using extra term list?
Minh Pham
minhpham0902 at gmail.com
Thu Oct 20 08:04:45 UTC 2011
Hi,
Could you please tell us exactly what your input is and what output you want?
By the way, after installing MeCab, you can view the tool's built-in help
from the command line by typing:
mecab.exe --help
The help is in English.
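As for the term list itself, one route that may work is a MeCab user dictionary. Below is a minimal Python sketch, not a tested recipe, that turns a list of terms into CSV rows. It assumes an IPADIC-style 13-column layout, treats every term as a general noun, and leaves the two context-ID fields empty on the assumption that the dictionary compiler fills them in. The function names and the default cost of 10 are made up for illustration; a lower cost should make MeCab more likely to pick the entry.

```python
import csv
import io

# Assumed IPADIC-style columns (check against your own system dictionary):
# surface, left-id, right-id, cost, POS, POS1, POS2, POS3,
# conjugation type, conjugation form, base form, reading, pronunciation


def term_to_row(term, cost=10):
    """Build one user-dictionary row for a term, treated as a general noun.

    The empty left-id/right-id fields and the default cost are assumptions,
    not values taken from the MeCab documentation for your dictionary.
    """
    return [term, "", "", str(cost), "名詞", "一般",
            "*", "*", "*", "*", term, "*", "*"]


def terms_to_csv(terms, cost=10):
    """Return CSV text with one row per non-empty term."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for term in terms:
        term = term.strip()
        if term:  # skip blank lines in the term list
            writer.writerow(term_to_row(term, cost))
    return buf.getvalue()


if __name__ == "__main__":
    # Hypothetical term list; replace with your own terms.
    print(terms_to_csv(["形態素解析", "東京スカイツリー"]), end="")
```

The CSV could then be compiled with something like `mecab-dict-index -d <dicdir> -u user.dic -f utf-8 -t utf-8 terms.csv` and loaded with `mecab -u user.dic`. Here `<dicdir>` is a placeholder for the directory of your installed system dictionary, and the character encodings depend on your setup.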
Best regards,
Pham
On Thu, Oct 20, 2011 at 4:22 PM, Adam Kilgarriff <adam at lexmasterclass.com> wrote:
> > However, since almost all of the user manual is in Japanese, I cannot
> > understand it completely.
>
> We have the same problem; are there English versions anywhere
> (especially for MeCab)? Pointers and advice appreciated.
>
> Adam
>
> On 20 October 2011 08:08, hf.jiang <hf.jiang at gmail.com> wrote:
>
>> Hi all,
>>
>> I'm currently trying to process Japanese texts.
>> Some friends suggested that I use ChaSen or MeCab.
>> However, since almost all of the user manual is in Japanese, I cannot
>> understand it completely.
>> I would like the segmentation tool to prefer the words in my term list
>> when it segments the text.
>>
>> Note that I do not have enough gold-standard data to train the tools,
>> so an off-the-shelf tool is better for me.
>>
>> Looking forward to your reply, thanks.
>>
>> -Hongfei Jiang
>>
>> _______________________________________________
>> UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
>> Corpora mailing list
>> Corpora at uib.no
>> http://mailman.uib.no/listinfo/corpora
>>
>>
>
>
> --
> ========================================
> Adam Kilgarriff <http://www.kilgarriff.co.uk/>
> adam at lexmasterclass.com
> Director Lexical Computing Ltd <http://www.sketchengine.co.uk/>
>
> Visiting Research Fellow University of Leeds <http://leeds.ac.uk>
>
> *Corpora for all* with the Sketch Engine <http://www.sketchengine.co.uk>
>
> *DANTE: a lexical database for English* <http://www.webdante.com>
> ========================================
>
>
--
Pham Quang Nhat Minh (Mr)
PhD student
NLP Laboratory - School of Information Science - JAIST
1-1 Asahidai, Nomi, 923-1292 Japan
Email: minhpqn at jaist.ac.jp
Web: http://www.jaist.ac.jp/index-e.html