[Corpora-List] Moving Lexical Semantics from Alchemy to Science
Rich Cooper
rich at englishlogickernel.com
Sat Jan 29 21:23:18 UTC 2011
Marco,
You wrote:
I fully agree with you and Katrin that the major challenge for our model
and its alternatives is to find convincing ways to evaluate whether it
learned what it purports to learn.
Best regards,
Marco
I can provide you with tools to make layered dictionaries of English
keywords if that helps you make vectors of the layered vocabularies. Would
that be useful for you?
-Rich
Sincerely,
Rich Cooper
EnglishLogicKernel.com
Rich AT EnglishLogicKernel DOT com
9 4 9 \ 5 2 5 - 5 7 1 2
-----Original Message-----
From: corpora-bounces at uib.no [mailto:corpora-bounces at uib.no] On Behalf Of
Marco Baroni
Sent: Friday, January 28, 2011 3:06 PM
To: Yorick Wilks
Cc: Corpora at uib.no; Krishnamurthy,Ramesh; Roberto Zamparelli
Subject: Re: [Corpora-List] Moving Lexical Semantics from Alchemy to Science
Dear Prof. Wilks,
I am one of the co-authors of the paper that Katrin kindly mentioned
(thanks, Katrin!).
Similar ideas are currently being explored by others, including Emiliano
Guevara, Daoud Clarke and colleagues, and Edward Grefenstette and
colleagues.
We are using a mathematical tool from the mid-19th century (matrices) in
order to apply intuitions from early-1970s formal semantics
(Montague and others) to corpus-based semantic models that were
developed in the early 1990s (LSA, HAL, ...), so we are not very posh
-- we are a tad musty, if anything.
We represent adjectives as matrices because a matrix is a simple way to
encode a function from and onto vectors.
We are trying to capture, in "distributional semantics", the intuition
(expressed by Montague and many others) that adjectives are functions
that map nouns onto other nouns, where what the function does crucially
depends on the input noun (so that "rubber" -- seen as an adjective --
is a function that can have a different effect when it maps "ball" onto
"rubber ball" from the one it has when it maps "duck" onto "rubber duck").
Since nouns, in many corpus-based approaches, are represented as vectors
of co-occurrence counts with collocates (documents), we treat adjectives
as matrices that encode linear functions from and onto such vectors.
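The idea can be sketched in a few lines of NumPy. This is a toy illustration with invented numbers, not the trained matrices from the paper: noun vectors live in a small hypothetical context space, and the adjective "rubber" is one matrix whose linear action nevertheless affects different nouns differently.

```python
import numpy as np

# Toy sketch (all numbers invented, not the authors' trained model):
# nouns are co-occurrence count vectors over a few hypothetical
# context dimensions, and an adjective is a matrix that linearly
# maps a noun vector onto an adjective+noun phrase vector.

# Hypothetical context dimensions: ["toy", "material", "animal"]
ball = np.array([5.0, 1.0, 0.0])
duck = np.array([1.0, 0.0, 6.0])

# "rubber" as a single 3x3 matrix: one linear map, but its effect
# depends on the input noun (here it boosts the "material" dimension
# and shifts animal-like mass toward the "toy" dimension).
rubber = np.array([
    [1.0, 0.0, 0.5],
    [0.5, 1.0, 0.5],
    [0.0, 0.0, 0.2],
])

rubber_ball = rubber @ ball   # -> [5.0, 3.5, 0.0]
rubber_duck = rubber @ duck   # -> [4.0, 3.5, 1.2]
```

The same matrix "rubber" maps "ball" and "duck" to quite different phrase vectors, which is the function-application intuition in miniature.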
I am (partially) aware of Pathfinder and other earlier work on
measuring word proximity, but it does not seem to me to
tackle the same challenge. We are using word/construction proximity to
evaluate our method, but the core of what the method does is building
larger constituents (adj+noun) from simpler ones (noun), which seems
like something different from what Pathfinder does (what little I know
of it).
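A proximity-based evaluation of this kind can be sketched as follows. This is a hedged illustration with invented toy vectors, not necessarily the exact protocol of the paper: the composed vector for a phrase is compared, by cosine similarity, against phrase vectors observed directly in the corpus, and a good model should rank the correct phrase closest.

```python
import numpy as np

# Hedged sketch of a proximity-based evaluation: compare a composed
# adjective+noun vector against corpus-observed phrase vectors by
# cosine similarity. All vectors here are invented toy data.

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Invented toy vectors over 4 context dimensions.
composed_rubber_ball = np.array([4.0, 3.0, 0.5, 1.0])   # model output
observed_rubber_ball = np.array([5.0, 3.5, 0.0, 1.0])   # from corpus
observed_red_ball    = np.array([0.5, 1.0, 4.0, 2.0])   # a distractor

# A good composition model ranks the correct observed phrase vector
# closer to its composed vector than distractor phrases.
sims = {
    "rubber ball": cosine(composed_rubber_ball, observed_rubber_ball),
    "red ball": cosine(composed_rubber_ball, observed_red_ball),
}
best = max(sims, key=sims.get)   # -> "rubber ball"
```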
I fully agree with you and Katrin that the major challenge for our model
and its alternatives is to find convincing ways to evaluate whether it
learned what it purports to learn.
Best regards,
Marco
--
Marco Baroni
Center for Mind/Brain Sciences (CIMeC)
University of Trento
http://clic.cimec.unitn.it/marco
_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora