[Corpora-List] Questions about collocations and collocation extraction tools

Martin Wynne martin.wynne at oucs.ox.ac.uk
Wed Aug 2 09:59:44 UTC 2006


> Although the BNC Baby doesn't claim to be representative of the whole
> BNC, it may suffer from the same typological 'text types' bias analyzed by
> David Lee in his PhD dissertation. The article http://llt.msu.edu/vol5num3/lee/default.html
> should give you an idea of the way he analyzes the metadata of the BNC
> texts to discuss the genre, register, text type, domain and style representativeness
> of the BNC. He designed the "BNC Index" to reclassify all the BNC texts
> from a didactic perspective.

If anyone is interested in how the texts in BNC Baby were actually 
selected, then please take a look at:

http://www.natcorp.ox.ac.uk/corpus/baby/

It is clear from this that the text selections were based on David Lee's 
text classifications, where these were relevant.

Please also note that David Lee's classifications are included in the 
metadata in current and proposed future releases of the BNC.

Martin

-- 
Martin Wynne
Head of the Oxford Text Archive and
AHDS Literature, Languages and Linguistics

Oxford University Computing Services
13 Banbury Road
Oxford
UK - OX2 6NN
Tel: +44 1865 283299
Fax: +44 1865 273275
martin.wynne at oucs.ox.ac.uk


