[Corpora-List] Summary of Suggested Food Resources

Craig Pfeifer craig.pfeifer at gmail.com
Fri Jul 11 14:59:41 UTC 2014


I received an overwhelming number of responses! I summarize them here for
the list, as they were presented:

The Unified Medical Language System (UMLS) Metathesaurus
http://www.nlm.nih.gov/research/umls/ has a "Food" semantic type:
http://www.nlm.nih.gov/research/umls/META3_current_semantic_types.html

The LA Times published a writeup on scraping together a database from its
archive of recipes using NLTK:

http://datadesk.latimes.com/posts/2013/12/natural-language-processing-in-the-kitchen

Carnegie Mellon University Recipe Database
http://www.ark.cs.cmu.edu/CURD/

Teng et al. (2012) created a large dataset of recipes by crawling
allrecipes.com: http://arxiv.org/pdf/1111.3919

IBM's recipe generation project is based in part on NLP analysis of food
resources, but I'm not sure whether these resources are described in detail
anywhere:
http://spectrum.ieee.org/computing/software/creating-recipes-with-artificial-intelligence
http://arxiv.org/pdf/1311.1213v1

Malmaud et al.'s (2014) ACL paper "Cooking with Semantics"
http://www.cs.ubc.ca/~murphyk/Papers/acl2014.pdf is not a "resource" of the
sort you're seeking but, like Teng et al., it points to a corpus-driven
approach to inducing some of the things you're looking for, which is just the
sort of thing we ought to recommend here on corpora-list. Even without the
CURD database that Diarmuid links, a verb in the executable part of a recipe
has a good chance of being a "process", and a noun phrase is likely to be an
"ingredient".
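
To make the verb/noun-phrase heuristic above concrete, here is a minimal
sketch in Python. It is not taken from any of the papers or resources listed
here. The verb lexicon, the stopword list, and the function name label_step
are all hypothetical and chosen only for illustration; a real system would
more likely use a POS tagger or a chunker. The sketch treats known cooking
verbs as "process" candidates and the runs of remaining content words as
"ingredient" candidates.

```python
import re

# Hypothetical, hand-picked lexicon of cooking verbs (an assumption for this
# sketch; a POS tagger would normally replace this lookup).
PROCESS_VERBS = {"add", "bake", "boil", "chop", "mix", "pour", "stir", "whisk"}

# Small illustrative stopword list used to split candidate phrases.
STOPWORDS = {"the", "a", "an", "and", "in", "into", "with", "to", "for", "of"}


def label_step(step):
    """Return (processes, ingredient_candidates) for one recipe instruction."""
    tokens = re.findall(r"[a-z]+", step.lower())
    processes = [t for t in tokens if t in PROCESS_VERBS]

    # Group consecutive tokens that are neither verbs nor stopwords into
    # candidate phrases; verbs and stopwords act as phrase boundaries.
    ingredients, current = [], []
    for token in tokens:
        if token in PROCESS_VERBS or token in STOPWORDS:
            if current:
                ingredients.append(" ".join(current))
                current = []
        else:
            current.append(token)
    if current:
        ingredients.append(" ".join(current))
    return processes, ingredients


processes, ingredients = label_step("Chop the onions and add the olive oil.")
print(processes)
print(ingredients)
```

This over-generates: any content word that is not in the two lists (for
example "golden" in "bake until golden") ends up as an ingredient candidate,
which is where a corpus or a resource such as CURD would help with filtering.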

Data from Chahuneau et al. (2012): http://victor.chahuneau.fr/pub/menus/data
(Dan Jurafsky may also have data:
http://web.stanford.edu/~jurafsky/foodpubs.html)

I believe some work at Saarland University covered food and recipes and
generated resources in a variety of modalities; see e.g.
http://www.coli.uni-saarland.de/~regneri/docs/TACoS.pdf



______________
craig.pfeifer at gmail.com