[Corpora-List] CRF++ parameter tuning

José Pablo González josepablog at gmail.com
Mon Oct 19 15:43:04 UTC 2009


The way you should do this is by cross-validating on a held-out set :-)
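
A minimal sketch of that idea, assuming the CRF++ binaries crf_learn and crf_test are on the PATH and that crf_test is run with its default output (the input columns followed by the predicted tag, with the gold tag in the column just before it). The function names, the "models" directory and the model file naming are made up for illustration; they are not part of CRF++. Token accuracy is used here only as a simple stand-in score; for NER one would probably swap in an entity-level F1 instead.

```python
import subprocess
from pathlib import Path


def token_accuracy(tagged_lines):
    """Token-level accuracy from default crf_test output.

    Assumption: each non-empty line is whitespace-separated, the last
    column is the predicted tag and the column before it is the gold tag.
    Blank lines (sentence boundaries) are skipped.
    """
    correct = total = 0
    for line in tagged_lines:
        cols = line.split()
        if len(cols) < 2:
            continue
        total += 1
        correct += cols[-1] == cols[-2]
    return correct / total if total else 0.0


def sweep(template, train, dev, c_values, workdir="models"):
    """Train one model per C value and score each on the held-out file."""
    Path(workdir).mkdir(exist_ok=True)
    scores = {}
    for c in c_values:
        # Hypothetical naming scheme for the per-value model files.
        model = str(Path(workdir) / f"model_c{c}")
        subprocess.run(
            ["crf_learn", "-c", str(c), template, train, model],
            check=True,
            stdout=subprocess.DEVNULL,
        )
        tagged = subprocess.run(
            ["crf_test", "-m", model, dev],
            check=True,
            capture_output=True,
            text=True,
        )
        scores[c] = token_accuracy(tagged.stdout.splitlines())
    return scores


def best_c(scores):
    """Return the C value with the highest held-out score."""
    return max(scores, key=scores.get)
```

Calling sweep("template", "train.data", "dev.data", [0.01, 0.1, 0.2, 0.5, 1, 2, 5, 10, 100, 1000]) and then best_c on the result would cover the grid mentioned below; the three file names are placeholders. For real cross-validation, the same sweep would be repeated over several train/held-out splits and the scores averaged per C value.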


On Thu, Oct 15, 2009 at 7:21 AM, Han-Cheol Cho <priancho at gmail.com> wrote:

> Dear researchers in NLP and related fields
>
> I received an e-mail about sigma value tuning for CRF++.
> As far as I know, there is no tool or package that finds even a sub-optimal
> C (actually sigma) value.
> However, some papers mention that the final accuracy does not change much
> even when the sigma value differs by a factor of 10.
>
> Actually, I am developing an NER tool for the bio-domain.
> In my case, I tried sigma values of 0.01, 0.1, 0.2, 0.5, 1, 2, 5, 10, 100
> and 1000.
> When I trained and tested the models with the NLPBA2004 shared task corpus,
> the models with sigma values of 0.5, 1 and 5 usually showed better
> performance.
> (Features are generated from words (both word identity and orthographic
> features), POS and shallow parsing), sigma value 0.5
>
> I think that training several models (5~10) could be enough to find a good
> sigma value.
>
> Sincerely yours,
>
> P.S. If someone knows a better way, please let me know.
>
>
> On Thu, Oct 15, 2009 at 7:58 PM, Ahmed Ragab <ahmed.nabhan at gmail.com> wrote:
>
>> Dear Colleagues,
>>
>> Greetings,
>>
>> I am using CRF++ for a Named Entity Recognition (NER) task, and the
>> documentation of CRF++ states that we should set the hyper-parameter
>> C to an appropriate value.
>> <quote>
>> This parameter trades the balance between overfitting and
>> underfitting. The results will significantly be influenced by this
>> parameter.
>> </quote>
>>
>> Is there any available tool (perhaps a Perl script) to perform
>> parameter tuning of CRF++ on a development set?
>>
>> Best wishes,
>> --
>> Ahmed Ragab Nabhan
>> Assistant Lecturer
>> Fayoum University - Egypt
>>
>> _______________________________________________
>> Corpora mailing list
>> Corpora at uib.no
>> http://mailman.uib.no/listinfo/corpora
>>
>
>
>
> --
> Han-Cheol Cho
> Tsujii Lab., Graduate School of Information Science and Technology, The
> University of Tokyo
>