Fw: Re: [Corpora-List] computing semantic word similarity

Сергей Крылов krylov-58 at mail.ru
Tue Nov 15 00:50:30 UTC 2005


-----Original Message-----
From: Сергей Крылов <krylov-58 at mail.ru>
To: Dimitar Blagoev <gefix at pu.acad.bg>
Date: Tue, 15 Nov 2005 03:48:23 +0300
Subject: Re: [Corpora-List] computing semantic word similarity

> 
> 
> I recommend using STARLING.EXE.
> See http://starling.rinet.ru
> Let me quote a short piece from the STARHELP.DBF:
> 
> _______________________
>   
>   SEMANTIC FUNCTIONS.
> 
>     All  the  semantic functions deal only with the  English
> language    and  are  of interest  mostly  for   comparative
> linguists.  They  require  the semantic  database  SENSE.DBF
> (plus   SENSE.VAR)  which  must be  located  together   with
> STARLING.EXE.
>     SENSE.DBF    is  a  collection of  about  7000   English
> headwords  described in terms of their semantic "attributes"
> or  "constituents" (all in all around 400). All the data was
> extracted  from existing etymological computer databases.  A
> record like
> 
> (HEADWORD) require (V) (ITEMS) to want;to search;to be;able
> 
> means  that  in  several  cases the  meaning  "require"  was
> associated   with  semantic  "primitives"  "to  want",   "to
> search", "to be" and "able".
> 
>     The functions now available are the following:
> 
> SENSE(par_C, par_L)
> 
>     This function returns the common semantic constituent(s)
> of all the words in par_C - if any. Thus,
> 
>      SENSE("tree; bush") = "grass;root;tree"
> 
>      Note that SENSE("tree") is returned as "grass;leaf;root;
> tree;stick;forest" and  SENSE("bush") is returned  as  "root; 
> tree;thorn;grass;fruit".
> 
>      If  a  second logical parameter is passed as  .T.,  the
> function  SENSE returns all the semantic constituents of all
> the    words    constituting par_C   (excluding    articles,
> prepositions and some other "empty" words). Thus,
> 
>      SENSE("tree; bush", .T.) will return
> 
> "grass;leaf;root;tree;stick;forest;thorn;grass;fruit".
> 
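The two modes of SENSE described above amount to an intersection and a union over per-word constituent lists. Below is a minimal Python sketch of that idea, not STARLING's actual implementation. The dictionary SENSE_DB is a toy stand-in for SENSE.DBF holding only the two entries quoted above. Splitting par_C on semicolons, the alphabetical ordering of the shared constituents, and the separator that common() inserts between its arguments are all assumptions of the sketch.

```python
# Toy stand-in for SENSE.DBF: headword -> ordered list of constituents.
# Only the two entries quoted in the help text are included.
SENSE_DB = {
    "tree": ["grass", "leaf", "root", "tree", "stick", "forest"],
    "bush": ["root", "tree", "thorn", "grass", "fruit"],
}


def split_words(par_c):
    """Split a string such as "tree; bush" into lowercase headwords."""
    return [w.strip().lower() for w in par_c.split(";") if w.strip()]


def sense(par_c, par_l=False):
    """Return shared constituents (default) or all constituents (par_l=True)."""
    lists = [SENSE_DB.get(w, []) for w in split_words(par_c)]
    if not lists:
        return ""
    if par_l:
        # Union, keeping first-seen order and dropping repeats.
        seen = []
        for items in lists:
            for item in items:
                if item not in seen:
                    seen.append(item)
        return ";".join(seen)
    # Intersection across all words, sorted for a stable result.
    shared = set(lists[0]).intersection(*lists[1:])
    return ";".join(sorted(shared))


def common(par_c1, par_c2):
    """Sketch of COMMON; the ";" separator is an assumption of this sketch."""
    return sense(par_c1 + ";" + par_c2)


print(sense("tree; bush"))        # grass;root;tree
print(sense("tree; bush", True))  # grass;leaf;root;tree;stick;forest;thorn;fruit
```

The union here drops repeated constituents, whereas the output quoted in the help text lists "grass" twice. The sketch does not settle which of the two STARLING actually does.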
> COMMON(par_C1,par_C2)
> 
>     This function is provided for convenience only and is fully
> equivalent to SENSE(par_C1+par_C2).
> 
> SIMILAR(par_C1, par_C2, par_C3, par_C4)
> 
>     This is a complex function with up to four character
> arguments. The first two are compared on the basis of the
> SOUND function, while the last two are compared on the
> basis of the SENSE function. The parameter par_C3 is
> supposed to be the meaning of par_C1, and the parameter
> par_C4 the meaning of par_C2. Thus,
> 
>    SIMILAR("hound","Hund","hound","dog") = .T.
> 
>    SIMILAR("dog","Hund","dog","dog") = .F.
> 
>     If  only  the first two parameters are passed, they  are
> compared  merely  by sound; if the first two parameters  are
> empty, the last two are compared merely by meaning. Thus:
> 
>    SIMILAR("hound", "Hand") = .T.
> (while SIMILAR("hound","Hand","hound","hand") is of course
> .F.)
> 
>    SIMILAR("","","hound","dog") = .T.
> (while SIMILAR("","","hound","hand") is .F.)
> 
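The argument handling of SIMILAR can be sketched in Python as follows. This is only an illustration under stated assumptions. The help excerpt does not say how SOUND works, so sound_key() below is a crude hypothetical stand-in (a lowercase consonant skeleton). TOY_SENSES contains invented constituents, not data from SENSE.DBF. The rule that both comparisons must succeed when all four arguments are given is inferred from the examples above rather than stated in the help text.

```python
# Hypothetical constituents chosen for illustration; not taken from SENSE.DBF.
TOY_SENSES = {
    "hound": {"dog", "hunt"},
    "dog": {"dog"},
    "hand": {"hand", "arm"},
}


def sound_key(word):
    """Crude stand-in for SOUND: the lowercase consonant skeleton of a word."""
    return "".join(ch for ch in word.lower() if ch.isalpha() and ch not in "aeiou")


def sound_alike(a, b):
    return sound_key(a) == sound_key(b)


def sense_alike(a, b):
    """True if the two meanings share at least one toy constituent."""
    return bool(TOY_SENSES.get(a.lower(), set()) & TOY_SENSES.get(b.lower(), set()))


def similar(c1="", c2="", c3="", c4=""):
    """Compare c1/c2 by sound and c3/c4 by meaning; empty pairs are skipped."""
    by_sound = bool(c1 and c2)
    by_sense = bool(c3 and c4)
    if not by_sound and not by_sense:
        return False
    sound_ok = sound_alike(c1, c2) if by_sound else True
    sense_ok = sense_alike(c3, c4) if by_sense else True
    return sound_ok and sense_ok


print(similar("hound", "Hund", "hound", "dog"))  # True
print(similar("dog", "Hund", "dog", "dog"))      # False
print(similar("hound", "Hand"))                  # True
print(similar("", "", "hound", "hand"))          # False
```

With these toy definitions the sketch reproduces the six results quoted above, though a real SOUND function would presumably be considerably more refined than a consonant skeleton.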
>     The function SIMILAR can now be invoked automatically
> while EDITING 2 FILES. The files are presently supposed to
> be standard etymological files with the fields PROTO and
> MEANING. Pressing Left Shift + F11 while editing one file
> automatically issues a Locate procedure
> equivalent to
> 
>    LOCATE FOR SIMILAR(FILE1->PROTO,FILE2->PROTO)
> 
>     Pressing  Right  Shift + F11 will result in issuing  the
> Locate procedure equivalent to
> 
>    LOCATE FOR SIMILAR("","",FILE1->MEANING,FILE2->MEANING)
> 
>    Note that in this case only semantic matches are searched
> for, and performance is generally slow.
> 
>     Finally,  pressing  Shift + F12 will summon  the  Locate
> procedure equivalent to
> 
>     LOCATE FOR SIMILAR(FILE1->PROTO, FILE2->PROTO,
>        FILE1->MEANING, FILE2->MEANING)
> 
>     By pressing F4 you can continue the search and browse
> through the whole second file looking for possible
> similarities.
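
The Locate-then-continue workflow described above resembles a lazy scan of the second file against the current record of the first. Here is a minimal Python sketch of that pattern, assuming records are dictionaries with PROTO and MEANING fields. The records, the function names, and the placeholder predicate same_first_letter() are all hypothetical. In practice the predicate would be whatever SIMILAR comparison was chosen.

```python
def locate_similar(current, second_file, predicate):
    """Yield (index, record) for each record of second_file matching current."""
    for index, record in enumerate(second_file):
        if predicate(current, record):
            yield index, record


def same_first_letter(rec1, rec2):
    """Placeholder predicate standing in for a SIMILAR(...) comparison."""
    return rec1["PROTO"][:1].lower() == rec2["PROTO"][:1].lower()


# Hypothetical records, invented for illustration.
file2 = [
    {"PROTO": "Hund", "MEANING": "dog"},
    {"PROTO": "Baum", "MEANING": "tree"},
    {"PROTO": "Hand", "MEANING": "hand"},
]
current = {"PROTO": "hound", "MEANING": "hound"}

matches = locate_similar(current, file2, same_first_letter)
first = next(matches)         # the initial Locate
second = next(matches, None)  # like F4: continue the search
third = next(matches, None)   # None once the second file is exhausted
print(first, second, third)
```

Using a generator means each match is computed only when it is requested, which fits the remark that purely semantic searches are generally slow.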
> 
> 
> ________________
> 
> Sincerely yours,
> Sergej A. Krylov
> 
> 
> -----Original Message-----
> From: "Dimitar Blagoev" <gefix at pu.acad.bg>
> To: "CORPORA" <CORPORA at UIB.NO>
> Date: Fri, 11 Nov 2005 22:28:43 +0200
> Subject: [Corpora-List] computing semantic word similarity
> 
> > 
> > Hello,
> > 
> > Could you tell me of any methods or programs (besides distributional similarity) for computing the semantic similarity between two words in one language? I am interested not only in English but also in whether this can be done for French, German, Spanish, etc.
> > 
> > 
> > Best regards. 
> > 
> > Dimitar Blagoev
> > gefix at pu.acad.bg
> > 2005-11-11
> > 
> > 
> 
> 


