25.3316, FYI: New Dataset for Semantic Similarity Measurements

The LINGUIST List linguist at linguistlist.org
Tue Aug 19 22:03:40 UTC 2014

LINGUIST List: Vol-25-3316. Tue Aug 19 2014. ISSN: 1069 - 4875.

Subject: 25.3316, FYI: New Dataset for Semantic Similarity Measurements

Moderators: Damir Cavar, Indiana U <damir at linguistlist.org>
            Malgorzata E. Cavar, Indiana U <gosia at linguistlist.org>

Reviews: reviews at linguistlist.org
Anthony Aristar <aristar at linguistlist.org>
Helen Aristar-Dry <hdry at linguistlist.org>
Mateja Schuck, U of Wisconsin Madison

Homepage: http://linguistlist.org

Editor for this issue: Uliana Kazagasheva <uliana at linguistlist.org>

Date: Tue, 19 Aug 2014 18:03:02
From: Felix Hill [felix.hill at cl.cam.ac.uk]
Subject: New Dataset for Semantic Similarity Measurements

We have just published a new dataset of 999 concept pairs rated by 500
annotators for semantic similarity (beer, ale), as distinct from relatedness
(beer, drink).

It is intended to provide a challenging benchmark for the evaluation of
representation and embedding-learning language models. It should also be of
interest to psycholinguists and cognitive scientists interested in
representation and conceptual concreteness.
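
As a rough illustration of how such a benchmark is often used: a common
convention (an assumption here, since this announcement does not specify an
evaluation protocol) is to compute the Spearman rank correlation between a
model's cosine similarities and the human ratings. The minimal sketch below
uses only the Python standard library. The function names, the toy vectors
and the toy ratings are all hypothetical placeholders and are not taken from
the dataset.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the average ranks."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


def evaluate(embeddings, rated_pairs):
    """Correlate model similarities with human ratings, skipping unknown words."""
    model_scores, human_scores = [], []
    for w1, w2, rating in rated_pairs:
        if w1 in embeddings and w2 in embeddings:
            model_scores.append(cosine(embeddings[w1], embeddings[w2]))
            human_scores.append(rating)
    return spearman(model_scores, human_scores)


# Hypothetical toy data, invented purely for illustration.
toy_embeddings = {
    "beer": [1.0, 0.1, 0.0],
    "ale": [0.9, 0.2, 0.0],
    "drink": [0.5, 0.8, 0.0],
    "car": [0.0, 0.1, 1.0],
}
toy_pairs = [
    ("beer", "ale", 9.0),    # placeholder rating, not from the dataset
    ("beer", "drink", 5.0),  # placeholder rating, not from the dataset
    ("beer", "car", 1.0),    # placeholder rating, not from the dataset
]

print(evaluate(toy_embeddings, toy_pairs))
```

With real data, the toy dictionaries would be replaced by a model's word
vectors and by the rated pairs read from the downloaded file. Skipping pairs
with out-of-vocabulary words, as done here, is one possible design choice;
it changes which pairs are scored, so the number of skipped pairs should be
reported alongside the correlation.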

For more information, and to download the dataset, visit:

Please cite the following paper if you use the dataset in your research:
Hill, F., Reichart, R. and Korhonen, A. 2014. SimLex-999: Evaluating Semantic
Models with (Genuine) Similarity Estimation. Preprint published on arXiv.

Linguistic Field(s): Cognitive Science
                     Computational Linguistics

