[Corpora-List] offer of research resource

Martin Wynne martin.wynne at oucs.ox.ac.uk
Wed Jun 28 09:04:27 UTC 2006


Dear Geoffrey and everyone,

I've had several messages offline asking why the OTA doesn't offer to 
take this resource, so before anyone else asks, I should point out that 
the Oxford Text Archive and the Arts and Humanities Data Service only 
archive electronic resources, and so, unfortunately, would not be able 
to offer a home for this valuable data in its current state. As I 
understand it, what is needed is a traditional archive for paper 
documents and magnetic media, or a project to digitise the data. (But 
please correct me if I'm wrong, Geoffrey.)

If anyone out there is in a position to consider undertaking a project
to digitise it, then I understand that Professor Sampson already has a 
detailed workplan. To make life even easier, the AHDS would be very 
happy to offer a free service to archive, catalogue, preserve and 
distribute the electronic data, on a non-exclusive basis. We could also 
give advice on digitisation, if needed.

Best wishes,
Martin

-- 
Martin Wynne
Head of the Oxford Text Archive and
AHDS Literature, Languages and Linguistics

Oxford University Computing Services
13 Banbury Road
Oxford
UK - OX2 6NN
Tel: +44 1865 283299
Fax: +44 1865 273275
martin.wynne at oucs.ox.ac.uk


Geoffrey Sampson wrote:
> Dear Colleagues,
> 
> I am looking for someone who would be interested in taking over
> responsibility for a valuable research resource I have been in charge of
> in recent years.
> 
> During the 1960s, a team of linguists sponsored by the Nuffield
> Foundation assembled a collection of the spontaneous spoken and written
> English of children and young people aged between 8+ and 15+ attending a
> variety of schools of diverse types in different urban and rural English
> regions:  the "Child Language Survey".  (This was initially intended as
> part of a multinational effort directed at improving foreign-language
> teaching in Europe, but I understand that parallel efforts in other
> countries fell through; the material has essentially been gathering dust
> more or less ever since it was compiled.)  The leading member of the
> team was Richard Handscombe, now long since retired from a Canadian
> university and in indifferent health.  After I used a small portion of
> the Survey for my LUCY treebank (www.grsampson.net/RLucy.html), Richard
> generously suggested that I should take charge of the entire Survey
> material, and arranged for it to be transported to my workplace in
> Sussex, where it now is.
> 
> Since then, I have made repeated attempts to get funding to computerize
> this material, clearly a necessary first step to unlocking the research
> potential it contains.  Although referees' reports on my various grant
> applications have been outstandingly positive, unfortunately no
> application has finally succeeded.  I now find myself too close to
> retirement for a further application to be worth making; even if I
> secured funding now, I would not have time to see the work through to
> completion.  Hence I would be interested in hearing from anyone younger
> who might succeed where I have failed.
> 
> In my view the collection has unparalleled potential scientific value.
> In the first place, it creates a possibility (which otherwise scarcely
> exists) of comparing spontaneous English usage across several decades of
> time -- children of the 1960s with children now, and/or the usage of a
> generation in childhood with the usage of the same generation now it is
> middle-aged.  One can envisage many significant applications to the
> study of language-skills education, for instance.  One anonymous grant
> referee in 2005 commented:
> 
>      "there is a yawning gap where there should be a research literature
> on grammatical development at school age (contrasting with a rich supply
> of research on both pre-school children and adults).  What is needed
> more than anything else is precisely what this project offers:  age-
> related data on speech and writing from the same children ..."
> 
> The written portion of the material represents children's spontaneous
> writing abilities in a way which in my experience is hard to match even
> for present-day children.  Collections of child writing often turn out to
> be heavily influenced by adult prose the children have consulted, but the
> Child Language Survey compilers found clever ways to get at what the
> children could do under their own steam.  And the quality of the
> collection is extremely high.  The spoken material has been transcribed
> with an accuracy that compares very favourably with the speech
> transcriptions in the British National Corpus (and I have the original
> tape-recordings as well as the transcriptions).  The written material
> has been converted from the children's handwriting into typescript with
> astonishing care, so that for instance every crossed-out letter is
> identified.  As a very rough estimate, the whole might comprise about
> 800,000 words of speech and about 200,000 words of writing.
> 
> It will be a minor scientific tragedy, to my mind, if this material is
> lost to scholarship.  Yet, if I cannot find a suitable home for it
> fairly soon, that fate looks unavoidable.
> 
> Accordingly, I should be very happy to hear from anyone who feels able
> to rescue the Child Language Survey from oblivion.  After handing it
> over, I would be willing, indeed eager, to retain an involvement, to the
> extent of advising on what I know about it, etc., but decisions would be
> for the new owner to make:  I have no wish to be a back-seat driver.  I
> would be quite willing to transfer the collection out of Britain -- I
> have the impression that scholarly values may be in a better state in
> some Continental European countries, for instance, than they are in
> British universities nowadays.  (And I would be glad to supply
> documentation on my grant applications, referee reports, etc., if they
> would help someone else construct a case for support.)
> 
> Anyone who would like to be considered is invited to contact me,
> commenting briefly on how he or she would hope to publish and/or exploit
> the material, and we can take it from there.
> 
> Geoffrey Sampson
> 
>  
> ............................................................
>      Prof. Geoffrey Sampson  MA PhD MBCS CITP ILTM
> 
>      author of "The 'Language Instinct' Debate"
> 
>      Department of Informatics, University of Sussex
>      Falmer, Brighton BN1 9QH, England
> 
>      www.grsampson.net     +44 1273 678525
> ............................................................


