[Corpora-List] Corpus of children's language

Eric Atwell csc6ea at leeds.ac.uk
Mon Jan 17 16:59:20 UTC 2011


Martina,
you could try the PoW Corpus, originally collected at the Polytechnic of
Wales (but from English-speaking children!). It is also PoS-tagged and
parsed using Systemic Functional Grammar:

http://kh.hd.uib.no/icame/manuals/pow.htm 
"... The corpus was originally collected between 1978-84 for a child
language development project to study the use of various syntactico-semantic
constructs in children between the ages of six and twelve. A sample of
approximately 120 children in this age range from the Pontypridd area in South Wales was
selected, and divided into four cohorts of 30, each within three months
of the ages 6, 8, 10, and 12. These cohorts were subdivided by sex (B,G) and
socio-economic class (A,B,C,D)...
... the parsed corpus consists of approximately 65,000 words
in 11,396 (sometimes very long) lines, each containing a parse tree. The
corpus of parse trees fills 1.1 Mb."


Eric Atwell, Senior Lecturer, Language research group,
  I-AIBS Institute for Artificial Intelligence and Biological Systems
  School of Computing, Faculty of Engineering, UNIVERSITY OF LEEDS
  Leeds LS2 9JT, England.        TEL: 0113-3435430  FAX: 0113-3435468

On Mon, 17 Jan 2011, Martina Bredenbröcker wrote:

> Dear all,
>
> I am looking for a corpus of spoken children's language in English.
> Ideally, it should have (transcribed) samples of eight to ten year old
> school children. Any suggestions would be greatly appreciated.
>
> Thanks a lot!
> Martina Bredenbroecker
>

