[Corpora-List] The Bank of English as a monitor corpus

Mark Davies Mark_Davies at byu.edu
Mon Sep 21 17:29:22 UTC 2009


A number of books and articles on corpus linguistics have commented on the (possible) use of the Bank of English as a monitor corpus (e.g. McEnery and Wilson 2001:30-31, Meyer 2002:15, Hunston 2002:30-31, Sampson and McCarthy 2004:396-98, McEnery et al. 2006:67-70, Baker et al. 2006:65).

I'm having trouble, however, finding many articles that have *actually used* the BoE as a monitor corpus -- i.e. to look at recent *changes* in English (rather than just synchronic phenomena). So far I can find only Rudanko 2003, 2005, 2007. I've mainly been searching the LLBA and MLA bibliographies.

I've been using the BoE myself recently (via wordbanks.harpercollins.co.uk), and I'm finding it a bit problematic. The genre balance changes so much over time that I'm not sure how one could use the data to study change. For example, fiction is 29% of the US part of the corpus for 1990-94, but only 16% for 1995-99. So for any given change in frequency, one can't really know whether it reflects an actual shift in the language or is simply an artifact of the changing genre balance in the corpus.
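To make the worry concrete, here is a rough sketch in Python. Only the two fiction shares (29% and 16%) come from the figures above. Everything else is an assumption for illustration: the per-genre rates are invented, all non-fiction material is lumped into a single hypothetical "other" genre, and the function name pooled_rate is mine. The per-genre rates are deliberately held constant across the two periods, so any difference in the pooled figure is due to genre balance alone.

```python
# Hypothetical per-genre rates of some construction, per million words.
# These are invented numbers and are kept IDENTICAL in both periods,
# i.e. the "language" does not change at all in this toy example.
rates = {"fiction": 500.0, "other": 100.0}

# Fiction shares are the figures quoted for the US part of the corpus;
# "other" is simply the remainder (an assumption made for illustration).
shares_1990_94 = {"fiction": 0.29, "other": 0.71}
shares_1995_99 = {"fiction": 0.16, "other": 0.84}


def pooled_rate(genre_rates, genre_shares):
    """Whole-corpus rate: per-genre rates weighted by each genre's share."""
    return sum(genre_rates[g] * genre_shares[g] for g in genre_rates)


# Raw pooled rates: what one sees when searching the whole period at once.
early = pooled_rate(rates, shares_1990_94)
late = pooled_rate(rates, shares_1995_99)

# Standardized rates: weight both periods by the same reference genre mix
# (here the 1990-94 mix), so that only per-genre rates can differ.
std_early = pooled_rate(rates, shares_1990_94)
std_late = pooled_rate(rates, shares_1990_94)

print(f"raw pooled rate:   {early:.1f} -> {late:.1f} per million")
print(f"standardized rate: {std_early:.1f} -> {std_late:.1f} per million")
```

With these made-up rates the raw pooled figure falls from about 216 to about 164 per million, roughly a 24% "decline", even though nothing changed within either genre. Weighting both periods by the same reference mix removes that effect, but it requires per-genre frequencies for each period, which is exactly what one would need to be able to get from the corpus.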

Anyway, if anyone is aware of other studies dealing with this topic, I'd appreciate hearing about them. I'll post the responses if there is sufficient interest.

Thanks in advance,

Mark

============================================
Mark Davies
Professor of (Corpus) Linguistics
Brigham Young University
(phone) 801-422-9168 / (fax) 801-422-0906

http://davies-linguistics.byu.edu

** Corpus design and use // Linguistic databases **
** Historical linguistics // Language variation **
** English, Spanish, and Portuguese **
============================================ 


_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora


