[Corpora-List] Boot Camp (Continued...)

W. Louw louw at mango.zw
Mon Aug 18 16:20:03 UTC 2008


Hello All

I have been waiting for the measured voice of reason... and it has come from
Geoffrey. I agree with him unreservedly. But then I would, wouldn't I?

I particularly like his devotional metaphors and think of building upon them as
follows. We all like the idea that corpus studies is a broad church (see the
examples of daily use cited by Stefan Gries), but although the spirit of
ecumenism is abroad in the corpora list (hope you got the prosody... Michael and
Sam are exempted), we need to pose this question:  if corpora are for general
use in a spirit of ecumenism, will they be fulfilling the purpose for which they
were created? John Sowa will appreciate this question because it is deeply
Wittgensteinian. Chomsky started by showing the corpus the door in a footnote in
Aspects (1964) and it never got invited back in the house.

Corpora used to be tiny (1 million words traditionally) and held on heavy tapes
that could be thrown onto desks with a shout of _habeas corpus!_ (heard less
frequently since Gitmo). However, John Sinclair brought them to the point at
which they are in excess of half a billion words of running text. He did this in
order to write the very first computer-concordanced dictionary: COBUILD. A
dictionary like that is amazing. It samples the world. And Wittgenstein told us 
"The world is all that is the case... the totality of facts not of things."
Sinclair found that 7.5 million was not enough to accomplish that sample.
Between 18 and 21 million would just about do it. Half a billion is amazing: almost 4
times what was used to get the 2nd edition out. John also wanted the corpus to
be a living entity and Antoinette saw to that. I recall the output of AVIATOR on
28 September 1991 in her office. The machine asked what is 'ethnic cleansing'?
It had just found it for the first time in the newspaper line feed for that day!
Wolfgang is an expert on monitor corpora, and so on. The corpus became an
AUTHORITY and reached the stage where it was there to be TRUSTED (exactly as
Geoffrey observes). The area where it inspires total trust is in collocation
(John called it hidden meaning, as I said earlier) and semantic prosody, because
collocation finds all that is seen and unseen (slightly off-limits for that
reason... sadly). It is not just a rival for the mind, it dwarfs it. Cognitive
celebrates what we know, corpora what we don't know. It does this so powerfully
that it has begun to settle all uncertain knowledge about language. Yet, the
linguists often sail blithely on without looking to those whose area this is:
philosophy of language (Chapman, Sowa, Louw etc.) and science (Williams, Kitcher
etc.). Language will become its own instrumentation through those who push for
collocation (well done Ramesh and Rosamund also) and thanks mainly to John.
Language may be the last area of study on earth to become science, but let it be
hard science, not reductionism to flatter mentalism. Wittgenstein believed the
psychologists because there were no computers and he wrote three and a half
volumes on the philosophy of psychology, but the silence today is deafening and
there is no excuse. It all looks a tad ingratiating and irresponsible,
especially if we know we have a duty to move towards science.

Bill

_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora


