[Corpora-List] (continued)

Khurshid Ahmad kahmad at cs.tcd.ie
Tue Aug 19 11:52:47 UTC 2008


Dear Lou
A voice of reason from Oxford (or shall I say another voice of reason).

Language teaching did inspire many of the living and deceased proponents
of corpus-based studies.  The creation of good up-to-date dictionaries was
facilitated by having access to corpora; lexicographic intuition
does kick in at some stage (text selection, evidence selection, tagger
selection, etc.) if only because lexicographers were asked to focus on
'contemporary British/American English' and lexicographers are human and
have intuition. (I do have a potted history in one of my papers; see
below for the reference.)

Research in translation has benefited from parallel corpora and has
resulted in systems of a kind that the theorists of FAHQ (fully automated,
high-quality) translation never could produce.  And, indeed, corpora play
an important role in research and development there as well.

Deriving keyword lists from corpora (not corpora in the sense the purists
would recognise) for indexing arbitrary sets of documents for search
engines (like Google and others) is another important use of corpora.

And, yes, text summarisation may not give us a major insight into what
Wolfgang would call meaning, which may be hermeneutically encoded, but it
does help you in skimming documents if you are a librarian or an archivist!

It is not easy to answer the (rhetorical?) question 'what is corpus
linguistics for?'.  But the iconoclastic nature of the enterprise, going
against the grain of cognitivism, was very apparent from the very
beginning.  The history of science tells us that the progress of science
is sometimes facilitated by an iconoclastic attitude.  Physics, an
iconoclasm for its time, is/was rooted in natural philosophy.  Natural
philosophy was a term to distinguish the enterprise from religious
philosophy (divinity), ethics, and other abstract entities.  Physics did
split from philosophy eventually because of its intense material focus,
but the touches of philosophy (and divinity) still remain (determinism,
relativity, the God particle and the Big Bang).  The subsequent use of
physics by engineers obscures not only the philosophy but some of the
physicists' more cherished notions.

Corpus linguistics, then, is another way of looking at language, and its
digression from the mainstream, along with the reminiscences of its early
protagonists, is very welcome.  If the results of the research in a
subject are being applied, and if some of our colleagues manage to get
themselves into mega projects, then it is all to the greater good of
corpus linguistics.

Following John Sinclair's interment in Florence, somebody read out Michael
Halliday's tribute to the man and I paraphrase: John used to say 'trust the
text, I say trust John Sinclair'.  Trust the intuition of a trusted
empiricist?

Reference:
Ahmad, K.  ‘Being in Text and Text in Being: Notes on representative
texts’.  In G. Anderman and M. Rogers (Eds.), Incorporating Corpora: The
Linguist and the Translator.  Clevedon: Multilingual Matters.  pp. 60-94.


> Lou (who really shouldn't post to public lists before having had his
> second coffee of the day) says "Sadly, none of [the language-teaching]
> community seems to have seen fit (yet) to contribute to the present
> discussion."
>
> Geoffrey Williams is very definitely a member of that community, and so
> I believe is Dr Louw. Apologies to both for misrepresenting their
> respective allegiances.
>
> I stand by my assertion that the debate hasn't taken fully into account
> the transformative effects of corpora and corpus-methods in that
> specific discipline though.
>
> _______________________________________________
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora
>


Khurshid Ahmad

Professor of Computer Science
Department of Computer Science
Trinity College,
DUBLIN-2
IRELAND
Phone 00 353 1 896 8429

Web Page: http://people.tcd.ie/kahmad

