[Corpora-List] BNC Baby : new xml corpora and software

Lou Burnard lou.burnard at computing-services.oxford.ac.uk
Tue Oct 19 13:41:45 UTC 2004

[apologies for any inconvenience caused by cross-posting]

                                 **** BNC BABY NOW AVAILABLE ****

Oxford University's Research Technologies Service is pleased to announce
the availability of BNC Baby: a new sampler CD designed to demonstrate the
full potential of corpus linguistics in the teaching of English language
and literature.  For more information, please visit our website at

BNC Baby contains three different XML corpora, complete with detailed
linguistic analysis and fully TEI-compliant markup. It also includes the
latest release of XAIRA (XML Aware Indexing and Retrieval Architecture)
developed at Oxford specifically for linguistically-motivated text
analysis of large XML corpora.

The three corpora are:
* BNC Baby: four one-million word extracts from the British National
Corpus, representing informal conversation, academic prose, fiction, and
* The Unknown Shakespeare: complete works of William Shakespeare in a
normalized spelling edition prepared at Northwestern University
* The Brooklyn Corpus of Old English: sixteen major texts of Old English
prose made available by the Oxford Text Archive

The software included on the CD can be installed on any machine running
Microsoft Windows 2000 or XP, and is supplied with built-in help and
links to tutorial materials under development as part of the Xaira
project. The software can also be used to index other XML-conformant
corpora, of any kind and in any language. Free updates of the software
will be made available through the Xaira project, the object of which is
to develop an open-source platform-independent toolkit (see
http://www.xaira.org for more information about Xaira).

To order your copy, go to http://natcorp.ox.ac.uk/ordering.html
Individual copies of the CD, including despatch by first class mail,
cost 30 euros, plus VAT within the EU. For orders of ten copies or
more, the unit price drops to 10 euros.

Lou Burnard
