[Corpora-List] Bootcamp: 'Quantitative Corpus Linguistics with R' -- re Louw's endorsement

Yorick Wilks yorick at dcs.shef.ac.uk
Thu Aug 28 17:07:15 UTC 2008


Saying "computers are deterministic" really captures nothing since von- 
neumann-style machines can perfectly well have access to a
random number generator to make choices. I do try to stay out of this  
duologue you are having (honestly!) but the endless autodidact
philosophy of language stuff (i.e. about what/where is meaning, if  
anywhere?) does need to raise its game a bit. There are many  
straightforward tutorials on the basics of the philosophy of language:  
my own modest contribution (that does link philosophy directly to  
corpora/linguistics etc. which most tutorials dont ) is in "Electric  
Words: dictionaries, computers and meanings (MIT Press, 1996) by  
Guthrie, Slator and myself-----it's not really out of date because the  
basic issues dont change much. Sorry for the testy tone of this--put  
it down to age!
Yorick Wilks


On 28 Aug 2008, at 17:18, Linas Vepstas wrote:

> Hi,
>
> 2008/8/28 Wolfgang Teubert <w.teubert at bham.ac.uk>:
>>
>> You mention qualia,
>
> I had read something which said (to paraphrase) "a concept or lexis
> does not exist except as a negotiated meaning within a corpus",
> which can be dangerously misunderstood, so let me rephrase it
> as a question: "Can meaning exist outside of the corpus of negotiated
> conversation?"  This has two obvious answers, "no" and "yes",
> both of which are correct, depending on what the meaning of "meaning"
> is. In the strict sense of corpus linguistics, the answer is "no":
> there can be no meaning except that which is found in language.
> And with this, I agree.
>
> Yet, it seems, we humans can talk about many things of non-linguistic
> origin, of which "qualia" was meant to be a primordial example. We
> map the linguistic concept of "pain" onto the quale we call "pain".
> The feeling itself exists outside of the discourse; it is an object of
> discourse.  Thus, I conclude that meaning does not exist in a vacuum,
> but exists in reference to the thing being talked about.  What's more,
> that reference exists only inside of us, and not in the corpus itself.
>
> So, while I can run corpus tools on a corpus, and find many
> interesting statistical correlations for the word "pain", is it
> correct to call the sum-total of these statistical correlations
> "the meaning of pain"?
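>
> (As a purely illustrative aside: by "statistical correlations" I mean
> roughly the sort of thing the following toy Python sketch computes.
> The tokenisation, window size and crude PMI scoring here are just
> assumptions for the example, not the workings of any particular
> corpus tool.)
>
>     import math
>     from collections import Counter
>
>     def collocates(tokens, target, window=4):
>         """Rank words by a rough pointwise mutual information with `target`."""
>         total = len(tokens)
>         freq = Counter(tokens)      # overall word frequencies
>         near = Counter()            # counts of words seen near `target`
>         for i, w in enumerate(tokens):
>             if w == target:
>                 for n in tokens[max(0, i - window): i + window + 1]:
>                     if n != target:
>                         near[n] += 1
>         # crude PMI: log2( p(w near target) / (p(w) * p(target)) )
>         pmi = {w: math.log2((c / total) /
>                             ((freq[w] / total) * (freq[target] / total)))
>                for w, c in near.items()}
>         return sorted(pmi.items(), key=lambda kv: -kv[1])
>
> The sum-total of such scores is the kind of object I mean.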
> Because, when you say:
>
>> Meaning is only in the discourse.
>
> that is what you seem to imply: that meaning is nothing more than
> the statistical vagaries of a text.  Given that there is an immense
> amount of structure in text, then perhaps we can indeed believe
> that meaning is nothing more (and nothing less!!) than the
> structure of text.  But then, in your next sentence:
>
>> It is what is exchanged between and shared by people.
>
> Ahh! But how do I exchange and share my thoughts about "pain"?
> I personally draw upon a font of qualia, and this font shapes the
> words that I choose to use when talking about "pain".  When I learn
> a new language, I learn the "meaning" of its words; but having
> learned these, I bring my own experience to bear when I use
> these words.
>
> It is along similar lines of reasoning that some AI folks now insist
> that intelligence can't be disembodied: no amount of statistical
> correlation taken from text will truly capture "meaning"; one must
> attach the machine to sensors (sight, touch, movement), so that,
> when one says to it, "this is a table", it can see and touch it.  It,
> too, can attach "the meaning of table" that was "data-mined" from
> a large corpus; it can attach that meaning to the thing that it
> senses.
>
> When I say "AI" I don't mean "human level intelligence". If you've
> followed the 2005 DARPA Grand Challenge, you know that we
> now have automobiles that can drive themselves, and its only
> cost and lawyers preventing them from going mass-market. If
> you follow the news from Iraq, you might know that some soldiers
> are now running around with verbal (voice-reco + speech-generation)
> hand-held machine-translation devices.  You don't need a lot of
> imagination to realize that you can hook up one of these
> self-driving cars to the voice unit.  We really aren't very many years
> away from being able to talk to our cars: "watch out for that table in
> the middle of the road!", followed by some confusion as to whether
> the mass detected by radar is a table or not.  Such a car would
> have an intelligence (much) lower than a human child's, yet
> nonetheless it would be a talking car.
>
> Should we collect a corpus of automotive utterances: things my
> machine said to me?  Should we run statistical tools on this corpus?
> Should we argue about what the machine meant, when it said "watch
> out for that table!"?  Should we argue about the "meaning of meaning"
> when we talk about the corpus of emulated English spoken by
> machines?
>
> Enough. Let me quibble a bit:
>
>> no machine that has intentionality.
>
> Well, the self-driving vehicle has the 'intent' of not hitting
> anything as it intentionally moves from point A to point B.
>
>> The result of summarisation is predictable;
>
> But it may still be surprising. If I ask my talking, self-driving machine
> to summarize the trip, it may well utter some unexpected remarks.
>
> In physics, there are dynamical systems exhibiting "chaotic
> behaviour": systems that are essentially unpredictable. These
> systems can be very simple: a few weights, a spring, a magnet.
> While being deterministic (described by fixed mathematical
> equations of motion), they remain unpredictable.
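>
> (A standard textbook illustration, sketched here only to make the
> point concrete; it is not a model of the weights-and-spring system,
> and the starting values below are arbitrary. The logistic map is a
> single deterministic line of arithmetic, yet two starting points that
> differ by one part in a billion end up in completely different places.)
>
>     def logistic(x, r=4.0, steps=50):
>         """Iterate the deterministic map x -> r*x*(1-x)."""
>         for _ in range(steps):
>             x = r * x * (1 - x)
>         return x
>
>     # Deterministic, yet the tiny difference in input is amplified at
>     # every step, so the two results bear no resemblance to each other.
>     print(logistic(0.200000000))
>     print(logistic(0.200000001))
>
> In that practical sense, "deterministic" and "predictable" come apart.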
>
> Now, computers are deterministic, doing whatever the
> software tells them to do, but, as dynamical systems, they are
> far, far more complex than a few springs and a magnet.
> While in principle computers are "predictable", in practice
> they are not: they only seem predictable because the
> programmers have taken great pains to make sure that
> their software does not surprise the user.
>
>> All mechanical, rule-based ways for describing the meaning of  
>> 'table' (including the statistical devices developed by corpus  
>> linguists) cannot replace our collaborative interpretation of the  
>> word as it crops up in the discourse.
>
> Why not?  With all due respect to humans, come the day we
> have self-driving cars (with an IQ of 2) that can spot tables
> in the middle of the road, and can talk about them, we seem to
> reach a crisis.
>
> --linas
>


_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora


