[Corpora-List] CRF++ parameter tuning

Han-Cheol Cho priancho at gmail.com
Thu Oct 15 11:21:46 UTC 2009


Dear researchers in NLP and related fields,

I received an e-mail about tuning the sigma value for CRF++.
As far as I know, there is no tool or package that finds even a sub-optimal C
(actually sigma) value.
However, some papers mention that the final accuracy does not change much
even when the sigma value differs by a factor of 10.

Actually, I am developing an NER tool for the bio domain.
In my case, I tried sigma values of 0.01, 0.1, 0.2, 0.5, 1, 2, 5, 10, 100, and 1000.
When I trained and tested the models on the NLPBA2004 shared task corpus,
the models with sigma values of 0.5, 1, and 5 usually showed better performance.
(Features are generated from words (both word identity and orthographic
features), POS tags, and shallow parsing), sigma value 0.5

I think that training several models (5 to 10) could be enough to find a good
sigma value.
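
Such a search over a handful of values could be scripted roughly as follows.
This is only a minimal sketch under several assumptions: the candidate values
are passed as-is to the -c option of crf_learn, the file names are
placeholders, the gold tag is the last column of the development file (so
crf_test appends the predicted tag after it), and plain token accuracy stands
in for whatever evaluation measure you actually use (for NER, an entity-level
F-score would normally be preferable).

```python
import subprocess
from pathlib import Path

# Candidate values for the -c option of crf_learn (the list from this mail).
CANDIDATES = [0.01, 0.1, 0.2, 0.5, 1, 2, 5, 10, 100, 1000]


def token_accuracy(tagged_lines):
    """Token accuracy over crf_test output lines.

    Assumes the gold tag is the second-to-last column and the predicted
    tag is the last column; blank lines separate sentences.
    """
    correct = total = 0
    for line in tagged_lines:
        cols = line.split()
        if len(cols) < 2:
            continue  # sentence boundary or malformed line
        total += 1
        correct += cols[-2] == cols[-1]
    return correct / total if total else 0.0


def train_and_score(c, template, train, dev, workdir="models"):
    """Train one model with `crf_learn -c <c>` and score it on the dev set."""
    Path(workdir).mkdir(exist_ok=True)
    model = str(Path(workdir) / f"model_c{c}")
    subprocess.run(["crf_learn", "-c", str(c), template, train, model],
                   check=True, stdout=subprocess.DEVNULL)
    out = subprocess.run(["crf_test", "-m", model, dev],
                         check=True, capture_output=True, text=True).stdout
    return token_accuracy(out.splitlines())


def best_value(scores):
    """Return the candidate with the highest score (first one wins on ties)."""
    return max(scores, key=scores.get)


def tune(template, train, dev, candidates=CANDIDATES):
    """Train one model per candidate value and report the best one."""
    scores = {c: train_and_score(c, template, train, dev) for c in candidates}
    for c, acc in scores.items():
        print(f"c={c}\ttoken accuracy={acc:.4f}")
    return best_value(scores)
```

With CRF++ installed and on the PATH, a call such as
tune("template", "train.data", "dev.data") would train one model per
candidate and return the value that scored best on the development set.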

Sincerely yours,

P.S. If someone knows a better way, please let me know.


On Thu, Oct 15, 2009 at 7:58 PM, Ahmed Ragab <ahmed.nabhan at gmail.com> wrote:

> Dear Colleagues,
>
> Greetings,
>
> I am using CRF++ for a Named Entity Recognition (NER) task, and the
> documentation of CRF++ states that we should set the hyper-parameter
> C to an appropriate value.
> <quote>
> This parameter trades the balance between overfitting and
> underfitting. The results will significantly be influenced by this
> parameter.
> </quote>
>
> Is there any available tool (perhaps a Perl script) to perform
> parameter tuning of CRF++ on a development set?
>
> Best wishes,
> --
> Ahmed Ragab Nabhan
> Assistant Lecturer
> Fayoum University - Egypt
>
> _______________________________________________
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora
>



-- 
Han-Cheol Cho
Tsujii Lab., Graduate School of Information Science and Technology, The
University of Tokyo