<html>


<head>

<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=us-ascii">


<meta name=Generator content="Microsoft Word 10 (filtered)">


<style>

<!--

 /* Font Definitions */

 @font-face

        {font-family:SimSun;

        panose-1:2 1 6 0 3 1 1 1 1 1;}

@font-face

        {font-family:Tahoma;

        panose-1:2 11 6 4 3 5 4 4 2 4;}

@font-face

        {font-family:"\@SimSun";

        panose-1:2 1 6 0 3 1 1 1 1 1;}

 /* Style Definitions */

 p.MsoNormal, li.MsoNormal, div.MsoNormal

        {margin:0cm;

        margin-bottom:.0001pt;

        font-size:12.0pt;

        font-family:"Times New Roman";}

a:link, span.MsoHyperlink

        {color:blue;

        text-decoration:underline;}

a:visited, span.MsoHyperlinkFollowed

        {color:purple;

        text-decoration:underline;}

span.EmailStyle17

        {font-family:Arial;

        color:navy;}

@page Section1

        {size:612.0pt 792.0pt;

        margin:72.0pt 90.0pt 72.0pt 90.0pt;}

div.Section1

        {page:Section1;}

-->

</style>


</head>


<body lang=EN-GB link=blue vlink=purple>


<div class=Section1>


<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:navy'>Stefano,</span></font></p>


<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:navy'> </span></font></p>


<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:navy'>This area has blossomed in recent years

and there is ample work on the question.</span></font></p>


<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:navy'> </span></font></p>


<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:navy'>Greg Grefenstette explored it in detail in

his thesis and associated book (<a

href="http://portal.acm.org/citation.cfm?coll=GUIDE&dl=GUIDE&id=527911">Explorations

in Automatic Thesaurus Discovery</a>, Kluwer, 1994).  Dekang Lin

introduced a new measure which has been adopted by quite a few people (including

myself) in his COLING 1998 paper.  Lillian Lee compared various measures

in her thesis; see her papers in Proc ACL 1999.  Since 2003, there have been two

excellent theses on the question, by Julie Weeds (Sussex Univ) and James Curran

(Edinburgh Univ).  Both have authored or co-authored various papers

further exploring the topic; see e.g. Weeds and Weir in the latest CL,

31 (4) 2005.  Geffet and Dagan (COLING 2004) is another thought-provoking

paper.  In ACL-COLING 2006, Gorman and Curran move on to the next question:

what are the computational issues in producing thesauruses from very large

(billion+ word) corpora?</span></font></p>


<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:navy'> </span></font></p>


<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:navy'>Regards,</span></font></p>


<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:navy'> </span></font></p>


<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:navy'>Adam</span></font></p>


<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:navy'> </span></font></p>


<p class=MsoNormal><font size=2 color=navy face=Arial><span style='font-size:

10.0pt;font-family:Arial;color:navy'> </span></font></p>


<p class=MsoNormal style='margin-left:36.0pt'><font size=2 face=Tahoma><span

lang=EN-US style='font-size:10.0pt;font-family:Tahoma'>-----Original

Message-----<br>

<b><span style='font-weight:bold'>From:</span></b> owner-corpora@lists.uib.no

[mailto:owner-corpora@lists.uib.no] <b><span style='font-weight:bold'>On Behalf

Of </span></b>Stefano Vegnaduzzo<br>

<b><span style='font-weight:bold'>Sent:</span></b> 28 June 2006 05:28<br>

<b><span style='font-weight:bold'>To:</span></b> CORPORA@UIB.NO<br>

<b><span style='font-weight:bold'>Subject:</span></b> [Corpora-List] mutual

similarity</span></font></p>


<p class=MsoNormal style='margin-left:36.0pt'><font size=3

face="Times New Roman"><span style='font-size:12.0pt'> </span></font></p>


<p class=MsoNormal style='margin-left:36.0pt'><font size=2

face="Times New Roman"><span style='font-size:10.0pt'>Dear all,<br>

<br>

I would like to ask for pointers/literature/references/etc on the topic of

mutual (or reciprocal) similarity. Here is what I mean by this:<br>

<br>

Given a term t0 and a set of terms t1 ... tn, a similarity measure M typically

allows you to rank the terms t1 ... tn according to their similarity to t0.<br>

<br>

My question: Given a term t0 and a set of terms t1 ... tn, and a similarity

measure M, and assuming a non-symmetric similarity relation (i.e., M(t1,t2) is

different from M(t2,t1)), how do you compute the mutual similarity MS of t0 with

respect to each term t1 ... tn, where M(t0,ti) is different from M(ti,t0)? In

other words, I am interested in computing and ranking the mutual similarity of

all pairs MS(t0,ti), where MS(t0,ti) is some function of M(t0,ti) and M(ti,t0).<br>

<br>

Cases of interest are for example those where M(t0,tX) is a bit higher than

M(t0,tY) but M(tY,t0) is much higher than M(tX,t0), so I would like a mutual

similarity measure to capture this by assigning MS(t0,tY) a higher score than

MS(t0,tX).<br>

<br>

I found very limited references in the literature. For example, D. Hindle, "Noun

classification from predicate-argument structures" (1990), defines reciprocal

similarity as the case where two terms are each other's most similar term, but

this is far too restrictive for what I am interested in.<br>

<br>

Any help will be appreciated,<br>

thanks,<br>

<br>

Stefano Vegnaduzzo<br>

 <br>

<br>

 </span></font> </p>
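
<p class=MsoNormal style='margin-left:36.0pt'><font size=2
face="Times New Roman"><span style='font-size:10.0pt'>The question above can be
made concrete with a minimal Python sketch of one possible MS. It is not taken
from any of the works mentioned in this thread. It assumes the directed scores
M(t0,ti) and M(ti,t0) are non-negative and combines them by their harmonic mean,
so a pair scores high only when both directions do. The function names and the
toy scores are hypothetical and chosen purely for illustration.</span></font></p>

<pre>
```python
def mutual_similarity(forward, backward):
    """Harmonic mean of two directed, non-negative similarity scores.

    This is only one candidate for MS (an assumption for illustration);
    min(forward, backward) or a geometric mean would be alternatives.
    """
    if forward + backward == 0:
        return 0.0
    return 2.0 * forward * backward / (forward + backward)


def rank_by_mutual_similarity(t0, candidates, m):
    """Rank candidate terms by MS(t0, ti), highest first.

    m is any callable m(a, b) giving the directed similarity of a to b.
    """
    scored = [(t, mutual_similarity(m(t0, t), m(t, t0))) for t in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy scores: M(t0,tX) is a bit higher than M(t0,tY),
# but M(tY,t0) is much higher than M(tX,t0).
TOY_SCORES = {
    ("t0", "tX"): 0.6, ("tX", "t0"): 0.1,
    ("t0", "tY"): 0.5, ("tY", "t0"): 0.5,
}


def toy_measure(a, b):
    """Look up the directed toy score M(a, b)."""
    return TOY_SCORES[(a, b)]


if __name__ == "__main__":
    for term, score in rank_by_mutual_similarity("t0", ["tX", "tY"], toy_measure):
        print(term, round(score, 3))
```
</pre>

<p class=MsoNormal style='margin-left:36.0pt'><font size=2
face="Times New Roman"><span style='font-size:10.0pt'>With these toy scores tY
ranks above tX, which is the behaviour asked for in the case of interest above.
Whether a harmonic mean is the right combination for a given measure M is an
open choice, not a recommendation.</span></font></p>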


</div>


</body>


</html>