[Corpora-List] Australian newspaper corpora

Steven Bird sb at csse.unimelb.edu.au
Tue May 8 02:02:10 UTC 2007


On 5/3/07, Monika Bednarek <Monika.Bednarek at phil.uni-augsburg.de> wrote:
> 1) Australian newspaper reportage (apart from the files included in
> the Australian component of the ICE)
> 2) (if possible Australian) newspaper headlines only
> 3) (if possible Australian) newspaper captions only

The NLTK corpus distribution has a year of Australian news text
(science news, rural news), with headlines separated from stories.

For a list of the 20+ corpora and corpus samples included with NLTK, please see:
http://nltk.sourceforge.net/wiki/index.php/Corpora

-Steven Bird


