[Corpora-List] a corpus of journalistic data in British

Paul Clough p.d.clough at sheffield.ac.uk
Mon Jun 16 09:08:20 UTC 2003


Hi Afida,

The METER (Measuring Text Reuse) corpus is a collection of around 1700
news stories from the Press Association (the UK's largest news agency)
and nine members of the British Press including both tabloid and quality
press, e.g. The Sun, Daily Mail, The Times, The Guardian. The corpus
consists of stories from two domains: court and law reporting, and
Showbusiness, both recurring and popular domains in the British Press.
The corpus is encoded in an XML version of TEI and is also available as
plain text. If you would like a copy, please email me. For more
information on the corpus and METER project, see:
http://www.dcs.shef.ac.uk/nlp/meter/

If you need a greater volume of text, you can buy copies of some British
dailies on CD, e.g. the Guardian.

Paul Clough.



-----Original Message-----
From: owner-corpora at lists.uib.no [mailto:owner-corpora at lists.uib.no] On
Behalf Of afida mohd
Sent: 13 June 2003 17:31
To: corpora at hd.uib.no
Subject: [Corpora-List] a corpus of journalistic data in British

Hi everyone,
does anyone know of any corpus of journalistic data in British English?

thanx.