[Corpora-List] Newspaper article corpora

Chris Jordan chris.jordan at acm.org
Thu May 11 13:09:32 UTC 2006


There is also a free Reuters collection, Reuters-21578, put together by David Lewis:
http://www.daviddlewis.com/resources/testcollections/reuters21578/

Chris

Mark Davies wrote:

>>Does anyone know of corpora devoted to English language 
>>newspaper articles (possibly available online)?  I saw that 
>>Reuters has developed one, but there are a few hurdles to 
>>navigate in order to access it.
>
>The BNC has about 10-11 million words from newspapers, and is available
>as part of the VIEW site (http://view.byu.edu). 
>
>This site also allows you to search just one part of the 100 million
>word corpus (i.e. just the newspapers), so this may be a useful feature
>for you. Finally, you can even zero in on just sub-registers of
>newspapers (broadsheet, tabloid, sports reporting, etc).
>
>Best,
>
>Mark Davies
>
>=================================================
>
>Mark Davies
>Assoc. Prof., Linguistics
>Brigham Young University
>(phone) 801-422-9168 / (fax) 801-422-0906
>
>http://davies-linguistics.byu.edu
>
>** Corpus design and use // Linguistic databases **
>** Historical linguistics // Language variation **
>** English, Spanish, and Portuguese **
>
>================================================= 


