[Corpora-List] Moving Lexical Semantics from Alchemy to Science

Katrin Erk katrin.erk at mail.utexas.edu
Fri Jan 28 19:25:48 UTC 2011


Hi all,

On Fri, Jan 28, 2011 at 10:42 AM, Yorick Wilks <Y.Wilks at dcs.shef.ac.uk> wrote:
> This discussion has been going on in AI and linguistics (and in philosophy a bit) for at least 40 years, and I'm worrying now that we aren't making progress: if there were any justice, corpora would help here! In 1972 Cohen and Margalit discussed what properties you could predict of a "rubber duck" from its components--i.e. properties that were different from those of a regular old duck: their claim, if I remember right, was that you couldn't make any.

In fact, there has recently been a paper that uses corpus evidence on
related cases, namely adjective-noun pairs:

Marco Baroni; Roberto Zamparelli
Nouns are Vectors, Adjectives are Matrices: Representing
Adjective-Noun Constructions in Semantic Space. Proceedings of EMNLP
2010, http://aclweb.org/anthology/D/D10/D10-1115.pdf

They use the contexts in which the adj/noun pairs are found to learn a
representation for each adjective that maps noun vectors to new
vectors for the adj/noun pair. Their approach should do the right
thing for "rubber duck"/"rubber chicken", or at least it should be
able to if "rubber" is reasonably frequent in the corpus.
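
A minimal sketch of that idea, under stated assumptions: each adjective
is a matrix fitted so that (matrix x noun vector) approximates the
corpus-observed vector of the adj/noun pair. The function names, the
toy random vectors, and the use of plain least squares are all
illustrative assumptions here; the paper's own estimation procedure and
vector spaces may differ.

```python
import numpy as np


def learn_adjective_matrix(noun_vecs, phrase_vecs):
    """Fit a matrix A so that A @ noun is close to the observed phrase vector.

    noun_vecs:   (n_pairs, dim) array, one row per noun seen with the adjective
    phrase_vecs: (n_pairs, dim) array, the observed adj/noun vectors (same order)
    Ordinary least squares is an assumption made for this sketch.
    """
    # lstsq solves noun_vecs @ X = phrase_vecs for X; the adjective matrix is X.T
    X, *_ = np.linalg.lstsq(noun_vecs, phrase_vecs, rcond=None)
    return X.T


def compose(adj_matrix, noun_vec):
    """Predict the adj/noun vector for a noun, possibly one unseen in training."""
    return adj_matrix @ noun_vec


# Toy data standing in for corpus-derived vectors (hypothetical, noise-free).
rng = np.random.default_rng(0)
dim, n_nouns = 5, 50
true_A = rng.normal(size=(dim, dim))      # pretend this is what "rubber" does
nouns = rng.normal(size=(n_nouns, dim))   # nouns observed with "rubber"
phrases = nouns @ true_A.T                # observed "rubber N" vectors

rubber = learn_adjective_matrix(nouns, phrases)
duck = rng.normal(size=dim)               # a noun vector not used in fitting
rubber_duck = compose(rubber, duck)
```

With real corpus vectors the fit would be noisy, and how good the
"rubber" matrix is would depend on how many distinct "rubber N" pairs
are attested, which is the frequency caveat above.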

Best,
Katrin


> Later AI people weighed in for a couple of decades--including
> me--arguing that, well, with some reasonable assumptions about the
> state of the world you could make some reasonable predictions in at
> least some cases. Though this would, inevitably, be dependent on an
> individual's knowledge state as well--it is not just a matter of some
> objective linguistic base or widely shared knowledge--and this is how
> poets work, as we all know. I wrote on this with colleagues in 1991
> under the title "Your metaphor or mine?". But those were still
> pre-corpus days, by and large, so we must have moved on a bit from
> examples now, no? I worked with a student a few years ago on
> extracting novel compounds from very large web corpora, e.g. ones hardly
> present in, say, 1995 but much represented in 2000--there was an
> interesting, and related, set of examples that emerged, but I couldn't
> see any way to publish them so as to make any claims.
> Examples are more fun than computing, of course, and I'm still obsessed with things like "rubber duck" (in the bath), which doesn't go the same way as "rubber chicken" (banquet food, as well as being a comedy prop)--I suppose enough facts about the distribution of meats at banquets might make this predictable, but I'm not confident.
> Yorick Wilks
>
>
>
>
>
> On 28 Jan 2011, at 10:53, Dominic Widdows wrote:
>
>> On Fri, Jan 28, 2011 at 10:36 AM,  <amsler at cs.utexas.edu> wrote:
>>> Technically, yes; but what I think makes a truly interesting combination is
>>> when the alternate meaning arises accidentally to serve a necessary purpose.
>>
>> Technically yes, but in practice, no - compounds have a well-known
>> property of (usually) only taking on some of the available meanings.
>>
>> There's some good literature on this, but being a parent of small
>> children my favourite by a long way is the song "When I See an
>> Elephant Fly."
>>
>> http://lyricsplayground.com/alpha/songs/w/wheniseeanelephantfly.shtml
>>
>> Best wishes,
>> Dominic
>>
>>> The reason 'solar system' is interesting is that I don't think the people
>>> who coined it were intentionally trying to be funny. Their domain used
>>> 'solar' in a whole array (sorry) of compounds consistent with only one
>>> meaning until they accidentally coined one compound that collided with the
>>> other meaning.
>>>
>>> I suppose one could distinguish between 'the solar system' and 'a solar
>>> system' (at least until recently, when astronomers started looking for
>>> extra-solar planets), but what I'm trying to say is that the ambiguous ones
>>> I'm most interested in are those that came about via evolutionary processes
>>> and somehow managed to both get established thus demonstrating two
>>> decompositional principles that are sustainable within the language.
>>>
>>> The fragility of these combinations is obvious as they violate a fundamental
>>> principle of discourse, i.e., being clear as to what one means. The BBC
>>> examples are excellent because they are 'real'. One should force the other
>>> out of existence once the perception of the ambiguity dawns on most people.
>>> Either that or force the addition of words for clarification, as in
>>> 'astronomical solar system' vs. 'solar energy system'.
>>>
>>> Quoting "Krishnamurthy, Ramesh" <r.krishnamurthy at aston.ac.uk>:
>>>
>>>> Hi all
>>>>
>>>> a) Surely any multi-word item involving at least one polysemous  element
>>>> would be a candidate?
>>>> e.g. civil service [service = an act or an organization]
>>>>
>>>> b) Or indeed, any pair of words, as they have the potential to  engage in
>>>> a variety of case relationships?
>>>> e.g. walking stick
>>>>
>>>>
>>>>
>>>> c) Then there's the problem of segmentation/sequence, i.e. "(a+b) +  c" or
>>>> "a + (b+c)"?
>>>>
>>>> e.g. hot water tap
>>>>
>>>> Best
>>>> Ramesh Krishnamurthy
>>>> Lecturer in English Studies, School of Languages and Social Sciences,
>>>> Aston University, Birmingham B4 7ET, UK
>>>> Tel: +44 (0)121-204-3812 ; Fax: +44 (0)121-204-3766 [Room NX08, 10th
>>>> Floor, North Wing of Main Building]
>>>> http://www1.aston.ac.uk/lss/staff/krishnamurthyr/
>>>> Director, ACORN (Aston Corpus Network project): http://acorn.aston.ac.uk/
>>>>
>>>>
>>>>
>>>> Message: 6
>>>>
>>>> Date: Fri, 28 Jan 2011 10:43:45 +0000
>>>>
>>>> From: Justin Washtell <lec3jrw at leeds.ac.uk>
>>>>
>>>> Subject: Re: [Corpora-List] Moving Lexical Semantics from Alchemy to
>>>>
>>>>      Science
>>>>
>>>> To: David Wible <wible at stringnet.org>, John Williams
>>>>      <j0hnwh0ever.corpora at gmail.com>
>>>>
>>>> Cc: "Corpora at uib.no" <Corpora at uib.no>
>>>>
>>>>
>>>>
>>>> Ancient history teachers.
>>>>
>>>> Or, a little tenuously, comprehensive ancient history teachers.
>>>>
>>>>
>>>>
>>>> Justin Washtell
>>>>
>>>> University of Leeds
>>>>
>>>> ________________________________________
>>>>
>>>> From: corpora-bounces at uib.no [corpora-bounces at uib.no]
>>>>  On Behalf Of David Wible [wible at stringnet.org]
>>>>
>>>> Sent: 28 January 2011 09:17
>>>>
>>>> To: John Williams
>>>>
>>>> Cc: Corpora at uib.no
>>>>
>>>> Subject: Re: [Corpora-List] Moving Lexical Semantics from Alchemy to
>>>> Science
>>>>
>>>>
>>>>
>>>> How about 'heavy metal fans'?
>>>>
>>>>
>>>>
>>>> David
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Thu, Jan 27, 2011 at 7:57 PM, John Williams
>>>>  <j0hnwh0ever.corpora at gmail.com>
>>>>  wrote:
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> ... P.S. Anyone have some other ambiguous open compounds they are
>>>>  familiar with, besides 'solar system'?
>>>>
>>>>
>>>>
>>>> 'golf club' springs to mind
>>>>
>>>>
>>>>
>>>> j0hn
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> -----------
>>>>
>>>>
>>>>
>>>> John Williams
>>>>
>>>> Lecturer in English Language and Linguistics
>>>>
>>>> School of Languages and Area Studies
>>>>
>>>> PK 2.18, University of Portsmouth
>>>>
>>>> Portsmouth PO1 2DZ
>>>>
>>>> Tel: (0239 284) 2162
>>>>
>>>> Email:
>>>>  john.x.williams at port.ac.uk
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> Corpora mailing list
>>> Corpora at uib.no
>>> http://mailman.uib.no/listinfo/corpora
>>>



-- 
Katrin Erk, Department of Linguistics
The University of Texas at Austin
http://comp.ling.utexas.edu/people/katrin_erk



