[Corpora-List] Moving Lexical Semantics from Alchemy to Science
Rich Cooper
rich at englishlogickernel.com
Sat Jan 29 21:23:18 UTC 2011
Marco,
You wrote:
I fully agree with you and Katrin that the major challenge for our model
and its alternatives is to find convincing ways to evaluate whether it
learned what it purports to learn.
Best regards,
Marco
I can provide you with tools to make layered dictionaries of English
keywords if that helps you make vectors of the layered vocabularies. Would
that be useful for you?
-Rich
Sincerely,
Rich Cooper
EnglishLogicKernel.com
Rich AT EnglishLogicKernel DOT com
9 4 9 \ 5 2 5 - 5 7 1 2
-----Original Message-----
From: corpora-bounces at uib.no [mailto:corpora-bounces at uib.no] On Behalf Of
Marco Baroni
Sent: Friday, January 28, 2011 3:06 PM
To: Yorick Wilks
Cc: Corpora at uib.no; Krishnamurthy,Ramesh; Roberto Zamparelli
Subject: Re: [Corpora-List] Moving Lexical Semantics from Alchemy to Science
Dear Prof. Wilks,
I am one of the co-authors of the paper that Katrin kindly mentioned
(thanks, Katrin!).
Similar ideas are currently being explored by others, including Emiliano
Guevara, Daoud Clarke and colleagues, and Edward Grefenstette and
colleagues.
We are using a mathematical tool from the mid-19th century (matrices) in
order to apply intuitions from early-1970s formal semantics
(Montague and others) to corpus-based semantic models that were
developed in the early 1990s (LSA, HAL, ...), so we are not very posh
-- we are a tad musty, if anything.
We represent adjectives as matrices because a matrix is a simple way to
encode a function from and onto vectors.
We are trying to capture, in "distributional semantics", the intuition
(expressed by Montague and many others) that adjectives are functions
that map nouns onto other nouns, where what the function does crucially
depends on the input noun (so that "rubber" -- seen as an adjective --
is a function that can have a different effect when it maps "ball" onto
"rubber ball" from the one it has when it maps "duck" onto "rubber duck").
Since nouns, in many corpus-based approaches, are represented as vectors
of co-occurrence counts with collocates (documents), we treat adjectives
as matrices that encode linear functions from and onto such vectors.
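The idea can be sketched in a few lines of NumPy. This is a toy illustration with invented numbers, not the trained matrices from the paper: noun vectors live in a small hypothetical context space, and the adjective "rubber" is one matrix whose linear action nevertheless affects different nouns differently.

```python
import numpy as np

# Toy sketch (all numbers invented, not the authors' trained model):
# nouns are co-occurrence count vectors over a few hypothetical
# context dimensions, and an adjective is a matrix that linearly
# maps a noun vector onto an adjective+noun phrase vector.

# Hypothetical context dimensions: ["toy", "material", "animal"]
ball = np.array([5.0, 1.0, 0.0])
duck = np.array([1.0, 0.0, 6.0])

# "rubber" as a single 3x3 matrix: one linear map, but its effect
# depends on the input noun (here it boosts the "material" dimension
# and shifts animal-like mass toward the "toy" dimension).
rubber = np.array([
    [1.0, 0.0, 0.5],
    [0.5, 1.0, 0.5],
    [0.0, 0.0, 0.2],
])

rubber_ball = rubber @ ball   # -> [5.0, 3.5, 0.0]
rubber_duck = rubber @ duck   # -> [4.0, 3.5, 1.2]
```

The same matrix "rubber" maps "ball" and "duck" to quite different phrase vectors, which is the function-application intuition in miniature.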
I am (partially) aware of Pathfinder and other earlier work on
measuring word proximity, but it does not seem to me to
tackle the same challenge. We are using word/construction proximity to
evaluate our method, but the core of what the method does is building
larger constituents (adj+noun) from simpler ones (noun), which seems
like something different from what Pathfinder does (what little I know
of it).
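A proximity-based evaluation of this kind can be sketched as follows. This is a hedged illustration with invented toy vectors, not necessarily the exact protocol of the paper: the composed vector for a phrase is compared, by cosine similarity, against phrase vectors observed directly in the corpus, and a good model should rank the correct phrase closest.

```python
import numpy as np

# Hedged sketch of a proximity-based evaluation: compare a composed
# adjective+noun vector against corpus-observed phrase vectors by
# cosine similarity. All vectors here are invented toy data.

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Invented toy vectors over 4 context dimensions.
composed_rubber_ball = np.array([4.0, 3.0, 0.5, 1.0])   # model output
observed_rubber_ball = np.array([5.0, 3.5, 0.0, 1.0])   # from corpus
observed_red_ball    = np.array([0.5, 1.0, 4.0, 2.0])   # a distractor

# A good composition model ranks the correct observed phrase vector
# closer to its composed vector than distractor phrases.
sims = {
    "rubber ball": cosine(composed_rubber_ball, observed_rubber_ball),
    "red ball": cosine(composed_rubber_ball, observed_red_ball),
}
best = max(sims, key=sims.get)   # -> "rubber ball"
```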
I fully agree with you and Katrin that the major challenge for our model
and its alternatives is to find convincing ways to evaluate whether it
learned what it purports to learn.
Best regards,
Marco
--
Marco Baroni
Center for Mind/Brain Sciences (CIMeC)
University of Trento
http://clic.cimec.unitn.it/marco
_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora