[Corpora-List] Moving Lexical Semantics from Alchemy to Science

Fri Jan 28 20:25:02 UTC 2011

I'm sure you can get something useful from word proximity computed over large corpora; to me the issues are which mechanisms do that best and what you can get out of it. There's a whole tradition of corpus-based clustering leading to clumps or graphs, from Karen Sparck Jones's thesis (1960s!) to the Pathfinder algorithms that gave nice weighted graphs *** in the 80s. I don't quite see what's new unless vectors and matrices add a lot to those simpler methods. Until proved wrong I'm prepared to bet they don't; it's just that people have forgotten them, as they forget everything, and "matrix" sounds mathematically posher (though in fact Sparck Jones did try to compute matrices, but EDSAC2 at Cambridge was too small to cope in those days).
BUT I just can't see that what you're describing leads to helpful paraphrases: listing out the "interpretations" of nearby words won't help, will it?
YW

***
McDonald, J. E., Plate, T. A., & Schvaneveldt, R. W. (1990). Using Pathfinder to extract semantic information from text. In R. Schvaneveldt (Ed.), Pathfinder associative networks: Studies in knowledge organization. (pp. 149-164). Norwood, NJ: Ablex.

On 28 Jan 2011, at 14:59, Katrin Erk wrote:

>> OK, but the problem is that you have to know what it is you are looking for closeness TO. Why would you seek relative proximity to toy and food unless you already knew those were the words corresponding to or capturing the ambiguity? In other words, you already have to know what the choices are (and in the case of "rubber chicken" it has both senses, and proximity in a space cannot show ambiguity, can it?). There's nothing about toys in any of the three components rubber/duck/chicken, surely? Cohen and Margalit were asking one right question, which is whether and how one could determine combination meaning from component meanings. I don't see how the proximity analysis you cite can do that.
> 
> Baroni and Zamparelli actually don't use predefined words (like toy)
> to compare to, but determine the nearest neighbors of an expression in
> space. Then you can interpret each word through its nearest neighbors.
> So the interpretation of each expression is a list of paraphrases.
> 
> But I think you mean more than that by "determining combination
> meaning". Something along the lines of being able to list all
> appropriate inferences to draw?
> 
> Assuming you mean that, my answer would be: Yes, in the end that's the
> goal, but for now this goal is too large. For now we need goals that
> work for the intermediate, smaller steps. And I think models like this
> one are on the right track because they can use corpus data to predict
> the meaning of words and expressions in context.
> 
> Katrin
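
The nearest-neighbour reading described in this message can be sketched roughly as follows. This is only an illustration, not Baroni and Zamparelli's model: the vectors, the three made-up dimensions, and the names `cosine` and `nearest_neighbors` are all assumptions introduced here, and a real space would be built from corpus co-occurrence counts.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def nearest_neighbors(query, space, k=3):
    """Return the k (word, similarity) pairs in `space` closest to `query`."""
    scored = [(word, cosine(query, vec)) for word, vec in space.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Hypothetical toy space; dimensions loosely stand for (play, eat, water) contexts.
space = {
    "toy": [0.9, 0.1, 0.3],
    "plaything": [0.8, 0.0, 0.2],
    "food": [0.1, 0.9, 0.0],
    "poultry": [0.0, 0.8, 0.1],
    "bath": [0.2, 0.0, 0.9],
}
rubber_duck = [0.8, 0.1, 0.6]  # assumed vector for the whole expression

# The ranked neighbour words serve as the "list of paraphrases".
for word, score in nearest_neighbors(rubber_duck, space):
    print(f"{word}\t{score:.3f}")
```

No anchor words are fixed in advance here; the neighbours are whatever happens to lie closest in the space.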
> 
> 
>> Y
>> 
>> On 28 Jan 2011, at 14:41, Katrin Erk wrote:
>> 
>>> On Fri, Jan 28, 2011 at 1:32 PM, Yorick Wilks <Y.Wilks at dcs.shef.ac.uk> wrote:
>>>> Hmmm... not quite sure what "doing the right thing" for the rubber duck and chicken would be. Surely no method like this could provide a representation, so what could it give at best?
>>> 
>>> It does provide a representation, the question is just how to
>>> interpret it and draw inferences from it. The most straightforward way
>>> is by testing closeness of the representation of the adj+noun pair to
>>> the representations of other expressions, say "toy" and "food". If the
>>> model gets rubber duck and chicken right, then it should predict (by
>>> measuring similarity/proximity in semantic space) that rubber duck is
>>> closer to "toy" than "food", and the other way round for the rubber
>>> chicken.
>>> 
>>> Katrin
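
As a rough sketch of the closeness test described in this message, assuming the expressions already have vectors in a shared semantic space: the numbers below are invented for illustration, and `cosine` and `closer_anchor` are hypothetical helper names, not part of any published model.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def closer_anchor(expression, anchors):
    """Return the anchor word whose vector is most similar to `expression`."""
    return max(anchors, key=lambda word: cosine(expression, anchors[word]))

# Hypothetical vectors; dimensions loosely stand for (play, eat, water) contexts.
anchors = {"toy": [0.9, 0.1, 0.3], "food": [0.1, 0.9, 0.0]}
rubber_duck = [0.8, 0.1, 0.6]
rubber_chicken = [0.3, 0.7, 0.0]

print(closer_anchor(rubber_duck, anchors))     # anchor nearest to "rubber duck"
print(closer_anchor(rubber_chicken, anchors))  # anchor nearest to "rubber chicken"
```

Note that the anchor words have to be supplied by hand, and the function returns a single winner even when an expression sits between the two.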
>>> 
>>>> 
>>>> On 28 Jan 2011, at 14:25, Katrin Erk wrote:
>>>> 
>>>>> Hi all,
>>>>> 

_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora