[Corpora-List] Texts 1900-1970: one more

Smith, Nicholas smithni at exchange.lancs.ac.uk
Thu Dec 15 18:05:58 UTC 2005


Dear Chris, list members,

We are nearing completion of a corpus of printed texts produced in 1931 (+/- 3 years), and have begun compiling a similar corpus of texts produced in 1901 (+/- 3 years).
Both corpora are modelled on the LOB and FLOB corpora of British English, sampling 1961 and 1991 respectively.
We expect to release the 1931 corpus next year, after clearing copyright permissions.
http://www.comp.lancs.ac.uk/ucrel/projects.html#prelob

Geoff Leech, Nick Smith, Paul Rayson
Lancaster University.
 



> -----Original Message-----
> From: owner-corpora at lists.uib.no [mailto:owner-corpora at lists.uib.no]On
> Behalf Of Chris Butler
> Sent: 15 December 2005 07:55
> To: corpora at hd.uib.no
> Subject: [Corpora-List] Texts 1900-1970
> 
> 
> My thanks to the following people, who all provided information on the
> availability of texts: Wendy Anderson, Carmela Chateau, Constantin Orasan,
> Raf Salkie, Dirk Siepmann, Pedro Ureña, Romain Vanoudheusden. The sources
> which were suggested are as follows:
> 
> There are old (and some recent) texts at Project Gutenberg.
> www.gutenberg.org/
> 
> The Public Library of Science has open-access texts.
> http://www.plos.org/about/openaccess.html
> 
> A selection of online math textbooks
> http://www.math.gatech.edu/~cain/textbooks/onlinebooks.html
> 
> The Intratext digital library (contains many religious texts, as well as a
> lot of literature)
> http://www.intratext.com/
> 
> The SCOTS Corpus (which is freely accessible and searchable at
> www.scottishcorpus.ac.uk) contains texts in Scottish English (as well as
> dialects of Scots), from 1940 to the present day.
> 
> The New York Times Archive
> (http://pqasb.pqarchiver.com/nytimes/advancedsearch.html) goes back to the
> 19th century.
> 
> The collection of texts hosted by archive.org
> (http://www.archive.org/details/texts) includes texts from Project
> Gutenberg.
> 
> The Victorian Literary Studies archive at
> http://victorian.lang.nagoya-u.ac.jp/index.html, which has a list of authors
> at http://victorian.lang.nagoya-u.ac.jp/concordance.html
> 
> The archive at www.questia.com
> 
> ******
> 
> I'd also like to mention the Corpus of Late Modern English Texts compiled by
> Hendrik de Smet at the Catholic University of Leuven
> (http://perswww.kuleuven.be/~u0044428/), a principled collection of texts
> (10 million words, 1720-1920) drawn from archives such as Project Gutenberg
> and the Oxford Text Archive. A username and password must be obtained from
> Hendrik (Hendrik.desmets at arts.kuleuven.be) in order to access the corpus.
> 
> Chris Butler
> Honorary Professor, University of Wales Swansea, UK