[Corpora-List] software semantic similarity between texts

Antonio Toral antonio.toral at ilc.cnr.it
Mon Oct 20 14:56:57 UTC 2008


Thanks for the suggestion Scott.

I tried the "Pairwise Comparison" tool on that website and it seems to do what 
I need: given some short texts, I get the "similarity" between each pair.

e.g. for sentences about "waterways" (glosses, definitions):

sentence1: "a navigable body of water"
sentence2: "a conduit through which water flows"
sentence3: "A waterway is any navigable body of water.  These include rivers, 
lakes, oceans, and canals."

I get:

pairwise_comparison (sentence1, sentence2) = 0.78
pairwise_comparison (sentence1, sentence3) = 0.85
pairwise_comparison (sentence2, sentence3) = 0.82
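
To make the shape of this computation concrete, here is a minimal Python sketch
of a pairwise comparison over the same three sentences. It is NOT the method
behind the website's tool: it uses plain bag-of-words cosine similarity rather
than LSA, so its scores will differ from the ones above. An LSA system would
first project the word-count vectors into a reduced semantic space trained on a
large corpus and only then take the cosine. The names tokenize, cosine and the
dictionary layout are assumptions made for illustration.

```python
import math
import re
from collections import Counter
from itertools import combinations


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as similar to nothing
    return dot / (norm_a * norm_b)


def pairwise_comparison(texts):
    """Return {(name_x, name_y): similarity} for every unordered pair."""
    vectors = {name: Counter(tokenize(text)) for name, text in texts.items()}
    return {
        (x, y): cosine(vectors[x], vectors[y])
        for x, y in combinations(vectors, 2)
    }


sentences = {
    "sentence1": "a navigable body of water",
    "sentence2": "a conduit through which water flows",
    "sentence3": "A waterway is any navigable body of water.  These include "
                 "rivers, lakes, oceans, and canals.",
}

for (x, y), score in pairwise_comparison(sentences).items():
    print(f"pairwise_comparison ({x}, {y}) = {score:.2f}")
```

Because this baseline only counts shared surface words, it rates sentence2
("conduit", "flows") as much less similar to the others than LSA does; closing
that gap is exactly what the reduced semantic space is for.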


However, that website offers only online demos, whereas I need software that 
I can download and integrate into a system. Do you know of any downloadable 
LSA package?

Regards,
Antonio



> Latent Semantic Analysis should do the trick. There are a variety of tools
> on the website that should help you out.
>
> http://lsa.colorado.edu/
>
> Scott Crossley, Ph.D.
> Linguistics/TESOL
>
> Department of English
> Mississippi State University
> http://www.msstate.edu/dept/english/tesol/tesolfaculty.html
> (662) 325-2355
>
> Institute for Intelligent Systems
> University of Memphis
> http://mnemosyne.csl.psyc.memphis.edu/iis/
>
>

_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora


