<html>
<head>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=us-ascii">
<style>
<!--
/* Font Definitions */
@font-face
{font-family:SimSun;
panose-1:2 1 6 0 3 1 1 1 1 1;}
@font-face
{font-family:Tahoma;
panose-1:2 11 6 4 3 5 4 4 2 4;}
@font-face
{font-family:"\@SimSun";
panose-1:2 1 6 0 3 1 1 1 1 1;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0cm;
margin-bottom:.0001pt;
font-size:12.0pt;
font-family:"Times New Roman";}
a:link, span.MsoHyperlink
{color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{color:purple;
text-decoration:underline;}
span.EmailStyle17
{font-family:Arial;
color:navy;}
@page Section1
{size:612.0pt 792.0pt;
margin:72.0pt 90.0pt 72.0pt 90.0pt;}
div.Section1
{page:Section1;}
-->
</style>
</head>
<body lang=EN-GB link=blue vlink=purple>
<div class=Section1>
<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'>Stefano,</span></font></p>
<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'> </span></font></p>
<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'>This area has blossomed in recent years
and there is ample work on the question.</span></font></p>
<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'> </span></font></p>
<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'>Greg Grefenstette explored it in detail in
his thesis and associated book (<a
href="http://portal.acm.org/citation.cfm?coll=GUIDE&dl=GUIDE&id=527911">Explorations
in Automatic Thesaurus Discovery</a>, Kluwer, 1994). Dekang Lin
introduced a new measure which has been adopted by quite a few people (including
myself) in his COLING 1998 paper. Lillian Lee compared various measures
in her thesis; see her papers in Proc. ACL 1999. Since 2003, two excellent
theses on the question have appeared, by Julie Weeds (Sussex Univ) and James Curran
(Edinburgh Univ). Both have authored or co-authored various papers
further exploring the topic – see e.g., Weeds and Weir in the latest CL,
31 (4) 2005. Geffet and Dagan (COLING 2004) is another thought-provoking
paper. In ACL-COLING 2006, Gorman and Curran move on to the next question:
what are the computational issues in producing thesauruses from very large (billion+
word) corpora?</span></font></p>
<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'> </span></font></p>
<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'>Regards,</span></font></p>
<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'> </span></font></p>
<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'>Adam</span></font></p>
<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'> </span></font></p>
<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:
10.0pt;font-family:Arial;color:navy'> </span></font></p>
<p class=MsoNormal style='margin-left:36.0pt'><font size=2 face=Tahoma><span
lang=EN-US style='font-size:10.0pt;font-family:Tahoma'>-----Original
Message-----<br>
<b><span style='font-weight:bold'>From:</span></b> owner-corpora@lists.uib.no
[mailto:owner-corpora@lists.uib.no] <b><span style='font-weight:bold'>On Behalf
Of </span></b>Stefano Vegnaduzzo<br>
<b><span style='font-weight:bold'>Sent:</span></b> 28 June 2006 05:28<br>
<b><span style='font-weight:bold'>To:</span></b> CORPORA@UIB.NO<br>
<b><span style='font-weight:bold'>Subject:</span></b> [Corpora-List] mutual
similarity</span></font></p>
<p class=MsoNormal style='margin-left:36.0pt'><font size=3
face="Times New Roman"><span style='font-size:12.0pt'> </span></font></p>
<p class=MsoNormal style='margin-left:36.0pt'><font size=2
face="Times New Roman"><span style='font-size:10.0pt'>Dear all,<br>
<br>
I would like to ask for pointers/literature/references/etc on the topic of
mutual (or reciprocal) similarity. Here is what I mean by this:<br>
<br>
Given a term t0 and a set of terms t1 ... tn, a similarity measure M typically
allows you to rank the terms t1 ... tn according to their similarity to t0.<br>
<br>
My question: Given a term t0 and a set of terms t1 ... tn, and a similarity
measure M, and assuming a non-symmetric similarity relation (i.e., M(t1,t2) is
different from M(t2,t1)), how do you compute the mutual similarity MS of t0 with
respect to each term t1 ... tn, where M(t0,ti) is different from M(ti,t0)? In
other words, I am interested in computing and ranking the mutual similarity of
all pairs MS(t0,ti), where MS(t0,ti) is some function of M(t0,ti) and M(ti,t0).<br>
<br>
Cases of interest are for example those where M(t0,tX) is a bit higher than
M(t0,tY) but M(tY,t0) is much higher than M(tX,t0), so I would like a mutual
similarity measure to capture this by assigning MS(t0,tY) a higher score than
MS(t0,tX).<br>
<br>
I found very limited references in the literature. For example, D. Hindle, "Noun
classification from predicate-argument structures" (1990), defines reciprocal
similarity as the case where two terms are each other's most similar term, but
this is way too restrictive for what I am interested in.<br>
<br>
Any help will be appreciated,<br>
thanks,<br>
<br>
Stefano Vegnaduzzo<br>
<br>
<br>
</span></font> </p>
</div>
</body>
</html>