[Corpora-List] software for measuring semantic similarity and relatedness?

Torsten Zesch zesch at ukp.informatik.tu-darmstadt.de
Mon Oct 7 20:16:42 UTC 2013


DKPro Similarity
http://code.google.com/p/dkpro-similarity-asl/
http://code.google.com/p/dkpro-similarity-gpl/

It is a comprehensive repository of similarity measures.
It covers most of the classic path- and gloss-based measures, which can be computed over WordNet, Wiktionary, Wikipedia, and OpenThesaurus.
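
As a rough illustration of what "path-based" and "gloss-based" mean here, the following is a toy sketch in Python. It is not DKPro Similarity's API: the tiny is-a taxonomy, the glosses, the stop-word list, and the function names are all made up for illustration, and real measures run over resources such as WordNet.

```python
# Hypothetical toy is-a taxonomy (child -> parent); NOT taken from WordNet.
HYPERNYM = {
    "dog": "canine",
    "wolf": "canine",
    "canine": "mammal",
    "cat": "feline",
    "feline": "mammal",
    "mammal": "animal",
}

# Hypothetical glosses, invented for this example.
GLOSS = {
    "dog": "a domesticated carnivorous mammal that barks",
    "wolf": "a wild carnivorous mammal that hunts in packs",
    "cat": "a small domesticated carnivorous mammal with soft fur",
}

# Minimal stop-word list (an assumption; real systems use larger lists).
STOP = {"a", "that", "in", "with"}


def ancestors(concept):
    """Return the chain from a concept up to the root, concept first."""
    chain = [concept]
    while chain[-1] in HYPERNYM:
        chain.append(HYPERNYM[chain[-1]])
    return chain


def path_similarity(a, b):
    """Path-based: 1 / (1 + edges on the shortest is-a path between a and b)."""
    steps_from_b = {c: i for i, c in enumerate(ancestors(b))}
    for steps_from_a, c in enumerate(ancestors(a)):
        if c in steps_from_b:
            # c is the lowest common ancestor in this tree-shaped taxonomy.
            return 1.0 / (1 + steps_from_a + steps_from_b[c])
    return 0.0  # no common ancestor


def gloss_overlap(a, b):
    """Gloss-based: number of distinct non-stop words shared by two glosses."""
    words_a = set(GLOSS[a].split()) - STOP
    words_b = set(GLOSS[b].split()) - STOP
    return len(words_a & words_b)


if __name__ == "__main__":
    for pair in [("dog", "wolf"), ("dog", "cat")]:
        print(pair, path_similarity(*pair), gloss_overlap(*pair))
```

The two scores need not agree: in this toy data "dog" is closer to "wolf" in the taxonomy, but its gloss shares more words with that of "cat".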

It also comes with complete experimental frameworks for Word Similarity experiments 
http://code.google.com/p/dkpro-similarity-asl/wiki/WordPairSimilarity
and Word Choice / TOEFL Synonym experiments.

-Torsten

> -----Original Message-----
> From: corpora-bounces at uib.no [mailto:corpora-bounces at uib.no] On Behalf
> Of Ted Pedersen
> Sent: Sunday, October 06, 2013 5:46 PM
> To: corpora at uib.no; wn-similarity at yahoogroups.com
> Subject: [Corpora-List] software for measuring semantic similarity and
> relatedness?
> 
> Greetings all,
> 
> I'm preparing a tutorial on measuring semantic similarity and relatedness
> between concepts. My particular focus is on methods that do this using
> ontologies or other (at least somewhat) structured resources (like Wikipedia,
> folksonomies, etc.) and that also have freely available software associated
> with them (or at least a web demo).
> 
> While it's a very interesting area, this particular tutorial won't include purely
> distributional approaches (due to time constraints), so I'm looking for
> methods and software that use some sort of resource like WordNet,
> Wikipedia, medical ontologies, Freebase, etc. to arrive at measurements of
> semantic similarity or relatedness between pairs of concepts.
> 
> What follows is my current list, based on projects I have not only heard of but
> also used in the not too distant past, so I guess I'm particularly interested in
> projects you have used or created yourself (and can therefore vouch for to
> some extent).
> 
> Based on WordNet, provide path, depth, and info content based measures, may
> include relatedness measures like lesk, vector, hso
> 
> WordNet::Similarity
> http://wn-similarity.sourceforge.net
> 
> NLTK
> http://nltk.org
> 
> ws4j
> https://code.google.com/p/ws4j/
> 
> Based on UMLS (Unified Medical Language System), provide path, depth,
> info content measures, includes relatedness measures lesk, vector
> 
> UMLS::Similarity
> http://umls-similarity.sourceforge.net
> 
> Based on the Gene Ontology (GO), provide path, depth, and info content measures
> 
> Proteinon
> http://lasige.di.fc.ul.pt/webtools/proteinon/
> 
> I will post a summary of whatever I hear about after some period of time.
> Any hints or suggestions will be very gratefully received.
> 
> Many thanks,
> Ted
> 
> --
> Ted Pedersen
> http://www.d.umn.edu/~tpederse
> 
> _______________________________________________
> UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora
