Fw: Re: [Corpora-List] computing semantic word similarity
Сергей Крылов
krylov-58 at mail.ru
Tue Nov 15 00:50:30 UTC 2005
-----Original Message-----
From: Сергей Крылов <krylov-58 at mail.ru>
To: Dimitar Blagoev <gefix at pu.acad.bg>
Date: Tue, 15 Nov 2005 03:48:23 +0300
Subject: Re: [Corpora-List] computing semantic word similarity
>
>
> I recommend using STARLING.EXE.
> See http://starling.rinet.ru
> Let me quote a short passage from STARHELP.DBF:
>
> _______________________
>
> SEMANTIC FUNCTIONS.
>
> All the semantic functions deal only with the English
> language and are of interest mostly for comparative
> linguists. They require the semantic database SENSE.DBF
> (plus SENSE.VAR) which must be located together with
> STARLING.EXE.
> SENSE.DBF is a collection of about 7000 English
> headwords described in terms of their semantic "attributes"
> or "constituents" (all in all around 400). All the data was
> extracted from existing etymological computer databases. A
> record like
>
> (HEADWORD) require (V) (ITEMS) to want;to search;to be;able
>
> means that in several cases the meaning "require" was
> associated with semantic "primitives" "to want", "to
> search", "to be" and "able".
>
> The functions now available are the following:
>
> SENSE(par_C, par_L)
>
> This function returns the common semantic constituent(s)
> of all the words in par_C - if any. Thus,
>
> SENSE("tree; bush") = "grass;root;tree"
>
> Note that SENSE("tree") is returned as "grass;leaf;root;
> tree;stick;forest" and SENSE("bush") is returned as "root;
> tree;thorn;grass;fruit".
>
> If a second logical parameter is passed as .T., the
> function SENSE returns all the semantic constituents of all
> the words constituting par_C (excluding articles,
> prepositions and some other "empty" words). Thus,
>
> SENSE("tree; bush", .T.) will return
>
> "grass;leaf;root;tree;stick;forest;thorn;grass;fruit".
>
> COMMON(par_C1,par_C2)
>
> This function is provided for convenience only and is fully
> equivalent to SENSE(par_C1+par_C2).
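The behaviour of SENSE and COMMON described above can be approximated in a few lines of Python. This is only a sketch, not STARLING's implementation: the dictionary SENSE_DB stands in for SENSE.DBF and holds just the two entries quoted in the help text, the names sense, common and all_constituents are my own, and joining the two arguments of COMMON with ";" is an assumption. Words missing from the toy dictionary are simply skipped, which only loosely imitates the exclusion of "empty" words, and the second mode lists each constituent once.

```python
# Toy stand-in for SENSE.DBF: headword -> ordered semantic constituents.
# Both entries are copied from the examples in the help text.
SENSE_DB = {
    "tree": ["grass", "leaf", "root", "tree", "stick", "forest"],
    "bush": ["root", "tree", "thorn", "grass", "fruit"],
}


def sense(words, all_constituents=False):
    """Return the constituents shared by every word in `words`, or, when
    `all_constituents` is true, all constituents with duplicates dropped."""
    # Split on ';' and keep only headwords known to the toy database.
    known = [w.strip() for w in words.split(";") if w.strip() in SENSE_DB]
    if not known:
        return ""
    if all_constituents:
        merged = []
        for word in known:
            for item in SENSE_DB[word]:
                if item not in merged:
                    merged.append(item)
        return ";".join(merged)
    # Keep the order of the first word; require presence in all the others.
    shared = [item for item in SENSE_DB[known[0]]
              if all(item in SENSE_DB[w] for w in known[1:])]
    return ";".join(shared)


def common(first, second):
    """Convenience wrapper; joining with ';' is an assumption of this sketch."""
    return sense(first + ";" + second)


print(sense("tree; bush"))
print(sense("tree; bush", True))
print(common("tree", "bush"))
```

With the two quoted entries, the first and third calls give "grass;root;tree", the same result the help text shows for SENSE("tree; bush").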
>
> SIMILAR(par_C1, par_C2, par_C3, par_C4)
>
> This is a complex function that takes up to four character
> arguments. The first two are compared on the basis of the
> SOUND function, while the last two are compared on the
> basis of the SENSE function. The parameter par_C3 is
> supposed to be the meaning of par_C1, and the parameter
> par_C4 the meaning of par_C2. Thus,
>
> SIMILAR("hound","Hund","hound","dog") = .T.
>
> SIMILAR("dog","Hund","dog","dog") = .F.
>
> If only the first two parameters are passed, they are
> compared merely by sound; if the first two parameters are
> empty, the last two are compared merely by meaning. Thus:
>
> SIMILAR("hound", "Hand") = .T.
> (while SIMILAR("hound","Hand","hound","hand") is of course
> .F.)
>
> SIMILAR("","","hound","dog") = .T.
> (while SIMILAR("","","hound","hand") is .F.)
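The way SIMILAR combines the two tests can be sketched as follows. Everything here is an assumption made for illustration: the help text does not say how SOUND works, so a crude consonant-skeleton comparison (sound_key) is swapped in for it, and the TOY_SENSES table is invented rather than taken from SENSE.DBF. Only the combination logic follows the description: each pair of arguments is checked if it is supplied, and every supplied pair must match.

```python
# Hypothetical sense data, invented for illustration (not from SENSE.DBF).
TOY_SENSES = {
    "hound": {"dog", "to hunt"},
    "dog": {"dog"},
    "hand": {"hand", "to hold"},
}

VOWELS = set("aeiou")


def sound_key(word):
    """Crude stand-in for SOUND: the lowercase consonant skeleton."""
    return "".join(ch for ch in word.lower()
                   if ch.isalpha() and ch not in VOWELS)


def similar(form1="", form2="", meaning1="", meaning2=""):
    """Compare forms by sound and meanings by shared constituents,
    using whichever pairs of arguments are non-empty."""
    checks = []
    if form1 and form2:
        checks.append(sound_key(form1) == sound_key(form2))
    if meaning1 and meaning2:
        first = TOY_SENSES.get(meaning1, set())
        second = TOY_SENSES.get(meaning2, set())
        checks.append(bool(first & second))
    # True only if at least one pair was supplied and all supplied pairs match.
    return bool(checks) and all(checks)


print(similar("hound", "Hund", "hound", "dog"))
print(similar("dog", "Hund", "dog", "dog"))
print(similar("hound", "Hand"))
print(similar("", "", "hound", "hand"))
```

On this toy data the sketch agrees with the six results quoted above, but that says nothing about how the real SOUND and SENSE comparisons behave on other words.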
>
> The function SIMILAR can now be invoked automatically
> while EDITING 2 FILES. The files are presently assumed to
> be standard etymological files with the fields PROTO and
> MEANING. Pressing Left Shift + F11 while editing one file
> automatically issues a Locate procedure
> equivalent to
>
> LOCATE FOR SIMILAR(FILE1->PROTO,FILE2->PROTO)
>
> Pressing Right Shift + F11 issues a Locate procedure
> equivalent to
>
> LOCATE FOR SIMILAR("","",FILE1->MEANING,FILE2->MEANING)
>
> Note that in this case only semantic matches are searched
> for and the performance is generally slow.
>
> Finally, pressing Shift + F12 will invoke the Locate
> procedure equivalent to
>
> LOCATE FOR SIMILAR(FILE1->PROTO, FILE2->PROTO,
> FILE1->MEANING, FILE2->MEANING)
>
> By pressing F4 you can continue the search and browse
> through the whole second file looking for possible
> similarities.
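The Locate-and-continue pattern described above amounts to a forward scan of the second file that can be resumed after each hit. A minimal Python sketch, with several assumptions: the records are invented, same_skeleton is a hypothetical stand-in for the sound comparison, plain equality of the MEANING fields stands in for the semantic comparison, and locate is my own name for the scan.

```python
def locate(records, predicate, start=0):
    """Return the index of the first record at or after `start` for which
    `predicate` holds, or -1 if there is none."""
    for index in range(start, len(records)):
        if predicate(records[index]):
            return index
    return -1


# Hypothetical records with the PROTO and MEANING fields named in the text.
file1_current = {"PROTO": "hund", "MEANING": "dog"}
file2 = [
    {"PROTO": "kat", "MEANING": "cat"},
    {"PROTO": "hand", "MEANING": "hand"},
    {"PROTO": "hond", "MEANING": "dog"},
]


def same_skeleton(a, b):
    """Stand-in for the sound comparison: equal consonant skeletons."""
    def strip(word):
        return "".join(ch for ch in word.lower() if ch not in "aeiou")
    return strip(a) == strip(b)


def by_form(rec):
    """Form-only match, in the spirit of Left Shift + F11."""
    return same_skeleton(file1_current["PROTO"], rec["PROTO"])


def by_both(rec):
    """Form and meaning match, in the spirit of Shift + F12."""
    return by_form(rec) and rec["MEANING"] == file1_current["MEANING"]


first = locate(file2, by_form)                # first form-only hit
next_hit = locate(file2, by_form, first + 1)  # resume, as F4 does
both = locate(file2, by_both)                 # form and meaning together
print(first, next_hit, both)
```

Resuming from first + 1 is what lets repeated calls walk through the whole second file, one candidate at a time.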
>
>
> ________________
>
> Sincerely yours,
> Sergej A. Krylov
>
>
> -----Original Message-----
> From: "Dimitar Blagoev" <gefix at pu.acad.bg>
> To: "CORPORA" <CORPORA at UIB.NO>
> Date: Fri, 11 Nov 2005 22:28:43 +0200
> Subject: [Corpora-List] computing semantic word similarity
>
> >
> > Hello,
> >
> > Could you tell me of any methods or programs (besides distributional similarity) for computing the semantic similarity between two words in the same language? I am interested not only in English but also in French, German, Spanish, etc.
> >
> >
> > Best regards.
> >
> > Dimitar Blagoev
> > gefix at pu.acad.bg
> > 2005-11-11
> >
> >
>
>