Corpora: Historical background of Corpus Linguistics

Eric Atwell eric at comp.leeds.ac.uk
Thu Apr 18 12:06:54 UTC 2002


Ramesh said:
> ... perhaps *the* earliest publication of linguistic research using an
> electronic corpus was: ...

...but don't forget even earlier Corpus Linguistics research done
without computers.  For example, modern Language Engineering researchers
extract Zipf distributions and Markov models from corpora; this was
done earlier "by hand":

Zipf, George Kingsley (1936) "The psycho-biology of language : an
introduction to dynamic philology" London : G. Routledge & sons

Markov, A.A. (1913) "Essai d'une recherche statistique sur le texte du
roman 'Eugene Onegin' illustrant la liaison des epreuves en chaine"
Izvestia Imperatorskoi Akademii Nauk (Bulletin de l'Academie Imperiale
des Sciences de St-Petersbourg) 7:153-162.

Does anyone have an earlier citation???


Eric Atwell

PS Leeds library has the Zipf book, but I don't actually have a copy of the
Markov paper; I copied the citation from Jurafsky & Martin (2000) "Speech and
Language Processing", Prentice Hall. Can someone let me have a copy, please?

--
Eric Atwell, Distributed Multimedia Systems MSc Tutor & SOCRATES Tutor
School of Computing, University of Leeds, LEEDS LS2 9JT
TEL: 0113-2335430  MOBILE: 0775-1039104 FAX: 0113-2335468
WWW: http://www.comp.leeds.ac.uk/eric  EMAIL: eric at comp.leeds.ac.uk


