[Corpora-List] NaCTeM Metabolite and Enzyme corpus

Paul Thompson Paul.Thompson at manchester.ac.uk
Fri Nov 16 14:21:04 UTC 2012


Recently, the field of systems biology has begun to model and simulate metabolic networks, requiring knowledge of the set of molecules involved. While genomics and proteomics technologies are able to supply the macromolecular parts list, the metabolites are less easily assembled. Most metabolites are known and reported through the scientific literature, rather than through large-scale experimental surveys. Thus, it is important to recover them from the literature.

We are pleased to announce the availability of the NaCTeM Metabolite and Enzyme corpus: http://www.nactem.ac.uk/metabolite-corpus/

The corpus is intended for training text mining systems to recognise metabolites and enzymes. It consists of 296 MEDLINE abstracts that have been manually annotated by domain experts.

The following paper provides more details about the corpus and a system trained to recognise metabolites automatically:

Nobata, C., Dobson, P., Iqbal, S. A., Mendes, P., Tsujii, J., Kell, D. B. and Ananiadou, S. (2011). Mining Metabolites: Extracting the Yeast Metabolome from the Literature. Metabolomics, 7(1), 94-101. (Available at: http://www.springerlink.com/content/e1727327007hx663/)

--------

Paul Thompson
Research Associate
School of Computer Science
National Centre for Text Mining
Manchester Institute of Biotechnology
University of Manchester
131 Princess Street
Manchester
M1 7DN
UK
Tel: 0161 306 3091
http://personalpages.manchester.ac.uk/staff/Paul.Thompson/




