Corpora: Children's or Graded Corpora Query Results

T Murphy tmorpheme at hotmail.com
Thu Jul 6 01:40:29 UTC 2000


I believe it's my obligation to report back to the list on the results of my
query. I asked about the "existence of children's corpora -- in the sense of
books for children, whether textbooks, novels for children or young adults,
vocabulary-graded readers and so on".

The results weren't fantastic, but the situation looks likely to improve over
the next few years. Copyright problems seem to be a major concern at the
moment, slowing progress on the corpora that are being created.

1. Quentin Allan, who is with the Teachers of English Language Education
Centre (TELEC) in the Department of Curriculum Studies at the University of
Hong Kong, provided me with what I thought was the most promising lead.

He is in the process of developing a TeleCorpus, a computer-based collection
of writing which is relevant to primary and secondary teachers in Hong Kong.

TeleCorpus is in two parts: the TeleNex Learner Corpus (primary and
secondary), and the TeleNex Corpus of Modern English. It's part of the
TELEC project (Teachers of English Language Education Centre).

The relevant part may be found here: http://www.telenex.hku.hk

They're also developing a concordancer for English teachers to use as an
access tool for the various corpora they've developed.

The TeleNex Corpus collection consists of:

     a corpus of primary students' writing
     a corpus of secondary students' writing (and transcriptions of oral
       presentations from Form 7 students)
     a corpus of general English (transcriptions of spoken English from the
       UK, feature articles from the SCMP, etc.)
     a corpus of texts which are relevant to the primary level English
       classroom (graded readers, fairy tales, etc.)

2. For classic children's literature, I was reminded by Christopher Tribble
to check out Project Gutenberg http://www.promo.net/pg/.

3. Prof. Geoffrey Sampson noted that "some people at Leeds gathered corpora
of children's own writing at two ages, I think 9 and 12, back in the 1960s
or 1970s". Prof. Sampson has copies of the published version.

4. Tony McEnery has constructed some small corpora of the writing of
children.

5. David Lee reminded the corpora list that the BNC contains a large number
of texts by both children and teens.

6. Finally, Leonel Ruiz of the Center of Applied Linguistics in Santiago de
Cuba, Cuba wrote me that the Center has done a study on the vocabulary of
Cuban children and that they have an interesting corpus of children's
vocabulary in Spanish.

My thanks to all who responded to the query.

Dr. Terry Murphy
Department of English,
Yonsei University
Seoul, Korea
