[Corpora-List] offer of research resource
Martin Wynne
martin.wynne at oucs.ox.ac.uk
Wed Jun 28 09:04:27 UTC 2006
Dear Geoffrey and everyone,
I've had several messages offline asking why the OTA doesn't offer to
take this resource, so before anyone else asks, I should point out that
the Oxford Text Archive and the Arts and Humanities Data Service only
archive electronic resources, and so, unfortunately, would not be able
to offer a home for this valuable data in its current state. As I
understand it, what is needed is a traditional archive for paper
documents and magnetic media, or a project to digitise the data. (But
please correct me if I'm wrong, Geoffrey.)
If anyone out there is in a position to consider undertaking a project
to digitise it, then I understand that Professor Sampson already has a
detailed workplan. To make life even easier, the AHDS would be very
happy to offer a free service to archive, catalogue, preserve and
distribute the electronic data, on a non-exclusive basis. We could also
give advice on digitisation, if needed.
Best wishes,
Martin
--
Martin Wynne
Head of the Oxford Text Archive and
AHDS Literature, Languages and Linguistics
Oxford University Computing Services
13 Banbury Road
Oxford
UK - OX2 6NN
Tel: +44 1865 283299
Fax: +44 1865 273275
martin.wynne at oucs.ox.ac.uk
Geoffrey Sampson wrote:
> Dear Colleagues,
>
> I am looking for someone who would be interested in taking over
> responsibility for a valuable research resource I have been in charge of
> in recent years.
>
> During the 1960s, a team of linguists sponsored by the Nuffield
> Foundation assembled a collection of the spontaneous spoken and written
> English of children and young people aged between 8+ and 15+ attending a
> variety of schools of diverse types in different urban and rural English
> regions: the "Child Language Survey". (This was initially intended as
> part of a multinational effort directed at improving foreign-language
> teaching in Europe, but I understand that parallel efforts in other
> countries fell through; the material has essentially been gathering dust
> more or less ever since it was compiled.) The leading member of the
> team was Richard Handscombe, now long since retired from a Canadian
> university and in indifferent health. After I used a small portion of
> the Survey for my LUCY treebank (www.grsampson.net/RLucy.html), Richard
> generously suggested that I should take charge of the entire Survey
> material, and arranged for it to be transported to my workplace in
> Sussex, where it now is.
>
> Since then, I have made repeated attempts to get funding to computerize
> this material, clearly a necessary first step to unlocking the research
> potential it contains. Although referees' reports on my various grant
> applications have been outstandingly positive, unfortunately no
> application has finally succeeded. I now find myself too close to
> retirement for a further application to be worth making; even if I
> secured funding now, I would not have time to see the work through to
> completion. Hence I would be interested in hearing from anyone younger
> who might succeed where I have failed.
>
> In my view the collection has unparalleled potential scientific value.
> In the first place, it creates a possibility (which otherwise scarcely
> exists) of comparing spontaneous English usage across several decades of
> time -- children of the 1960s with children now, and/or the usage of a
> generation in childhood with the usage of the same generation now it is
> middle-aged. One can envisage many significant applications to the
> study of language-skills education, for instance. One anonymous grant
> referee in 2005 commented:
>
> "there is a yawning gap where there should be a research literature
> on grammatical development at school age (contrasting with a rich supply
> of research on both pre-school children and adults). What is needed
> more than anything else is precisely what this project offers: age-
> related data on speech and writing from the same children ..."
>
> The written portion of the material represents children's spontaneous
> writing abilities in a way which in my experience is hard to match even
> for present-day children. Collections of child writing often turn out to
> be heavily influenced by the adult prose they have consulted, but the
> Child Language Survey compilers found clever ways to get at what the
> children could do under their own steam. And the quality of the
> collection is extremely high. The spoken material has been transcribed
> with an accuracy that compares very favourably with the speech
> transcriptions in the British National Corpus (and I have the original
> tape-recordings as well as the transcriptions). The written material
> has been converted from the children's handwriting into typescript with
> astonishing care, so that for instance every crossed-out letter is
> identified. As a very rough estimate, the whole might comprise about
> 800,000 words of speech and about 200,000 words of writing.
>
> It will be a minor scientific tragedy, to my mind, if this material is
> lost to scholarship. Yet, if I cannot find a suitable home for it
> fairly soon, that fate looks unavoidable.
>
> Accordingly, I should be very happy to hear from anyone who feels able
> to rescue the Child Language Survey from oblivion. After handing it
> over, I would be willing, indeed eager, to retain an involvement, to the
> extent of advising on what I know about it, etc., but decisions would be
> for the new owner to make: I have no wish to be a back-seat driver. I
> would be quite willing to transfer the collection out of Britain -- I
> have the impression that scholarly values may be in a better state in
> some Continental European countries, for instance, than they are in
> British universities nowadays. (And I would be glad to supply
> documentation on my grant applications, referee reports, etc., if they
> would help someone else construct a case for support.)
>
> Anyone who would like to be considered is invited to contact me,
> commenting briefly on how he or she would hope to publish and/or exploit
> the material, and we can take it from there.
>
> Geoffrey Sampson
>
>
> ............................................................
> Prof. Geoffrey Sampson MA PhD MBCS CITP ILTM
>
> author of "The 'Language Instinct' Debate"
>
> Department of Informatics, University of Sussex
> Falmer, Brighton BN1 9QH, England
>
> www.grsampson.net +44 1273 678525
> ............................................................