[Corpora-List] Gimli: open source and high-performance biomedical name recognition

David Campos david.campos at ua.pt
Tue Feb 19 20:50:07 UTC 2013


David Campos, Sérgio Matos and José Luís Oliveira.
"Gimli: open source and high-performance biomedical name recognition."
BMC Bioinformatics, vol. 14, no. 1, p. 54, February 2013
http://www.biomedcentral.com/1471-2105/14/54
doi:10.1186/1471-2105-14-54

Abstract
========

Background
Automatic recognition of biomedical names is an essential task in biomedical information extraction, presenting several complex and unsolved challenges. In recent years, various solutions have been implemented to tackle this problem. However, limitations regarding system characteristics, customization and usability still hinder their wider application outside text mining research.

Results
We present Gimli, an open-source, state-of-the-art tool for automatic recognition of biomedical names. Gimli includes an extended set of implemented, user-selectable features, including orthographic, morphological, linguistic, conjunction and dictionary-based features. A simple and fast method for combining different trained models is also provided. Gimli achieves an F-measure of 87.17% on the GENETAG corpus and 72.23% on the JNLPBA corpus, significantly outperforming existing open-source solutions.
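
To give a sense of what orthographic and morphological features usually look like in a biomedical NER system, here is a minimal Python sketch. It is not taken from Gimli and does not reflect its actual feature set or API; the function names `token_features` and `word_shape`, and the particular features chosen, are assumptions made purely for illustration.

```python
import re


def word_shape(token):
    """Map a token to a coarse shape: uppercase -> 'A', lowercase -> 'a', digit -> '1'.

    Hypothetical helper for illustration; other characters are kept as-is.
    """
    shape = re.sub(r"[A-Z]", "A", token)
    shape = re.sub(r"[a-z]", "a", shape)
    shape = re.sub(r"[0-9]", "1", shape)
    return shape


def token_features(token):
    """Return simple surface-form features for one token (illustrative only)."""
    return {
        # Orthographic features: how the token is written.
        "init_cap": token[:1].isupper(),
        "all_caps": token.isupper(),
        "has_digit": any(ch.isdigit() for ch in token),
        "has_hyphen": "-" in token,
        "shape": word_shape(token),
        # Morphological features: fixed-length prefix and suffix.
        "prefix3": token[:3].lower(),
        "suffix3": token[-3:].lower(),
    }


if __name__ == "__main__":
    for tok in ["BRCA1", "IL-2", "kinase"]:
        print(tok, token_features(tok))
```

In a sequence-labelling setup, dictionaries like these would be computed per token and passed to the learner together with context (conjunction) and dictionary-lookup features.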

Conclusions
Gimli is an off-the-shelf, ready-to-use tool for named-entity recognition, providing trained and optimized models for recognition of biomedical entities from scientific text. It can be used as a command-line tool, offering full functionality, including training of new models and customization of the feature set and model parameters through a configuration file. Advanced users can integrate Gimli into their text mining workflows through the provided library, and extend or adapt its functionality. Based on the underlying system characteristics and functionality, both for end users and developers, and on the reported performance results, we believe that Gimli is a state-of-the-art solution for biomedical NER, contributing to faster and better research in the field. Gimli is freely available at http://bioinformatics.ua.pt/gimli.


--  
David Campos
Bioinformatics Group
IEETA, University of Aveiro
www.davidcampos.org

