Corpora: Children's or Graded Corpora

David Lee dave at davidlee00.freeserve.co.uk
Wed Jun 14 01:22:33 UTC 2000


Dear Terry,

I hope I'm not beginning to sound like a broken record by constantly
mentioning the British National Corpus (BNC), but there really is a wide
variety of material available within that corpus (it was designed that
way!)... if only it were easy for people to see what's in it. I'll
shortly be announcing the release of my "BNC Index", which is intended to
facilitate just such an exploration of the jungle (and jumble) of BNC
texts, but, in the meantime, I hope the following helps:

The BNC contains 44 files, totalling about 930,100 words, that are
coded as having 'children' as the target audience. These are almost all
prose fiction texts, with only a handful of magazines ("Brownie"), a
couple of miscellaneous/non-fiction texts, and one maths textbook (!).

Another 74 files, totalling 1,702,392 words, are coded for 'teenagers'.
Again, most are fiction texts, but there are also texts written *by*
teenagers (school essays), and several magazines and non-fiction texts
intended for a young readership.


Perhaps some of this material is suited to your needs.


David Lee

P.S. On a personal note, I have not been able to access my Hotmail
account for the past 4 days due to technical problems at their end, so
if anyone on the list has sent me any mail recently that needs a reply,
please re-send it to the present e-mail account. Thanks.

-----------------------------------------------------------------
David YW Lee              ***********************************
Dept of Linguistics       *  Stop the narrowing of minds    *
Lancaster University      *  Affirm the diversity of life   *
Lancaster LA1 4YT         ***********************************
England, UK.

Email: david_lee00 at hotmail.com (main account) or
       dave at davidlee00.freeserve.co.uk (when Hotmail isn't working!)
-----------------------------------------------------------------