[Corpora-List] Bootcamp: 'Quantitative Corpus Linguistics with R'--re Louw's endorsement
Yorick Wilks
yorick at dcs.shef.ac.uk
Thu Aug 28 17:07:15 UTC 2008
Saying "computers are deterministic" really captures nothing since von-
neumann-style machines can perfectly well have access to a
random number generator to make choices. I do try to stay out of this
duologue you are having (honestly!) but the endless autodidact
philosophy of language stuff (i.e. about what/where is meaning, if
anywhere?) does need to raise its game a bit. There are many
straightforward tutorials on the basics of the philosophy of language:
my own modest contribution (that does link philosophy directly to
corpora/linguistics etc. which most tutorials dont ) is in "Electric
Words: dictionaries, computers and meanings (MIT Press, 1996) by
Guthrie, Slator and myself-----it's not really out of date because the
basic issues dont change much. Sorry for the testy tone of this--put
it down to age!
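
A small illustration in Python (a sketch only, nothing specific to any
real system): a perfectly ordinary deterministic machine whose choice
cannot be predicted from its program text, because it consults the
operating system's entropy pool.

    import random

    # SystemRandom draws on the operating system's entropy source
    # (os.urandom), so the choice below is not recoverable from the
    # program text alone, even though the machine running it is an
    # ordinary deterministic von Neumann computer.
    rng = random.SystemRandom()
    print(rng.choice(["left", "right", "straight on"]))
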
Yorick Wilks
On 28 Aug 2008, at 17:18, Linas Vepstas wrote:
> Hi,
>
> 2008/8/28 Wolfgang Teubert <w.teubert at bham.ac.uk>:
>>
>> You mention qualia,
>
> I had read something which said (to paraphrase) "a concept or lexis
> does not exist except as a negotiated meaning within a corpus",
> which can be dangerously misunderstood -- let me rephrase it
> as a question: "Can meaning exist outside of the corpus of negotiated
> conversation?" This has two obvious answers, "no" and "yes",
> both of which are correct, depending on what the meaning of
> "meaning" is. In the strict sense of corpus linguistics, the answer
> is "no": there can be no meaning except that which is found in
> language. And with this, I agree.
>
> Yet, it seems, we humans can talk about many things of non-linguistic
> origin, of which "qualia" was meant to be a primordial example. We
> map the linguistic concept of "pain" onto the quale we call "pain".
> The feeling itself exists outside of the discourse; it is an object of
> discourse. Thus, I conclude that meaning does not exist in a vacuum,
> but exists in reference to the thing being talked about. What's more,
> that reference exists only inside of us, and not in the corpus itself.
>
> So, while I can run corpus tools on a corpus and find many
> interesting statistical correlations for the word "pain", is it
> correct to call the sum-total of these statistical correlations
> "the meaning of pain"?
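>
> (To make "statistical correlations" concrete, here is a rough sketch
> in Python, not in any particular corpus tool, of the simplest such
> statistic: co-occurrence counts for "pain" within a small window.
> The three sentences and the window size are invented purely for
> illustration.)
>
>     from collections import Counter
>
>     # A tiny invented corpus, standing in for real discourse.
>     corpus = [
>         "the pain in my knee returned after the long walk",
>         "she spoke openly about the pain of losing a friend",
>         "aspirin dulled the pain but not the worry",
>     ]
>
>     window = 3          # words of context on each side (arbitrary)
>     cooc = Counter()    # co-occurrence counts around "pain"
>
>     for sentence in corpus:
>         words = sentence.split()
>         for i, w in enumerate(words):
>             if w == "pain":
>                 left = words[max(0, i - window):i]
>                 right = words[i + 1:i + 1 + window]
>                 cooc.update(left + right)
>
>     # The crudest form of the "sum-total of statistical correlations":
>     # a bag of weighted neighbours.
>     print(cooc.most_common(5))
>
> Real corpus tools refine this with association measures (mutual
> information, t-scores and so on), but what they return is still, in
> the end, a table of numbers.
>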
> Because, when you say:
>
>> Meaning is only in the discourse.
>
> that is what you seem to imply: that meaning is nothing more than
> the statistical vagaries of a text. Given that there is an immense
> amount of structure in text, then maybe we can indeed believe
> that meaning is nothing more (and nothing less!!) than the
> structure of text. But then, in your next sentence:
>
>> It is what is exchanged between and shared by people.
>
> Ahh! But how do I exchange and share my thoughts about "pain"?
> I personally draw upon a font of qualia, and this font shapes the
> words that I choose to use when talking about "pain". When I learn
> a new language, I learn the "meaning" of its words; but having
> learned these, I bring my own experience into play when I use
> these words.
>
> It is along similar lines of reasoning that some AI folks now insist
> that intelligence can't be disembodied: no amount of statistical
> correlation taken from text will truly capture "meaning"; one must
> attach the machine to sensors (sight, touch, movement), so that,
> when one says to it, "this is a table", it can see and touch it.
> It, too, can then take "the meaning of table" that was "data-mined"
> out of a large corpus and attach that meaning to the thing that is
> sensed.
>
> When I say "AI" I don't mean "human level intelligence". If you've
> followed the 2005 DARPA Grand Challenge, you know that we
> now have automobiles that can drive themselves, and its only
> cost and lawyers preventing them from going mass-market. If
> you follow the news from Iraq, you might know that some soldiers
> are now running around with verbal (voice-reco + speech-generation)
> hand-held machine-translation devices. You don't need a lot of
> imagination to realize that you can hook up one of these
> self-driving cars to the voice unit. We really aren't very many years
> away from being able to talk to our cars: "watch out for that table in
> the middle of the road!", followed by some confusion as to whether
> the mass detected by radar is a table or not. Such a car would
> have an intelligence (much) lower than a human child's, yet
> nonetheless it would be a talking car.
>
> Should we collect a corpus of automotive utterances: things my
> machine said to me? Should we run statistical tools on this corpus?
> Should we argue about what the machine meant, when it said "watch
> out for that table!"? Should we argue about the "meaning of meaning"
> when we talk about the corpus of emulated English spoken by
> machines?
>
> Enough. Let me quibble a bit:
>
>> no machine that has intentionality.
>
> Well, the self-driving vehicle has the 'intent' of not hitting
> anything as it intentionally moves from point A to point B.
>
>> The result of summarisation is predictable;
>
> But it may still be surprising. If I ask my talking, self-driving machine
> to summarize the trip, it may well utter some unexpected remarks.
>
> In physics, there are dynamical systems exhibiting "chaotic
> behaviour": systems that are essentially unpredictable. These
> systems can be very simple: a few weights, a spring, a magnet.
> While deterministic (described by fixed mathematical
> equations of motion), they remain unpredictable.
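>
> (A minimal numerical sketch in Python, using the logistic map, a
> standard textbook example rather than the spring-and-magnet system
> above, though the phenomenon is the same: every step is a fixed
> equation, yet two nearly identical starting points part company
> completely within a few dozen steps.)
>
>     # Logistic map x -> r*x*(1-x) with r = 4.0, a chaotic regime.
>     r = 4.0
>     x = 0.4
>     y = 0.4 + 1e-10     # differs from x by one part in ten billion
>
>     for _ in range(60):
>         x = r * x * (1 - x)
>         y = r * y * (1 - y)
>
>     # The gap grows by roughly a factor of two per step; after 60
>     # perfectly deterministic steps the two trajectories have
>     # completely decorrelated.
>     print(abs(x - y))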
>
> Now, computers are deterministic, doing whatever the
> software tells them to do, but, as dynamical systems, they are
> far, far more complex than a few springs and a magnet.
> While in principle a computer is "predictable", in practice
> it is not: it only seems predictable because the
> programmers have taken great pains to make sure that
> their software does not surprise the user.
>
>> All mechanical, rule-based ways for describing the meaning of
>> 'table' (including the statistical devices developed by corpus
>> linguists) cannot replace our collaborative interpretation of the
>> word as it crops up in the discourse.
>
> Why not? All due respect to humans, but come the day we
> have self-driving cars (with an IQ of 2) that can spot tables
> in the middle of the road, and can talk about them, we seem to
> reach a crisis.
>
> --linas
>
_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora