New Corpus

Charles Meyer meyer at UMBSKY.CC.UMB.EDU
Sat Nov 28 13:29:56 UTC 1998


The Survey of English Usage, University College London, is pleased to
announce the release of the ICE-GB corpus, the British component of the
International Corpus of English (ICE).

ICE-GB is a fully parsed corpus of adult British English from the 1990s.
It contains 300 spoken texts and 200 written texts - a total of 1
million words. The texts are distributed across 32 categories, including
private conversations, telephone calls, court proceedings, broadcasts,
social letters, examination scripts, and academic writing.

ICE-GB has been grammatically analysed at wordclass level, and at the
function and category levels. The analyses are presented as labelled
syntactic trees - 83,419 trees in total.

The corpus is distributed with its own dedicated retrieval software,
ICECUP.

ICE-GB and ICECUP are available now on CD-ROM.

A Sample Corpus of ten parsed texts, together with ICECUP, may be
downloaded free from the Survey website, at
http://www.ucl.ac.uk/english-usage/

With apologies for cross-postings.