25.12, FYI: ARCHER Corpus Available Online

linguist at linguistlist.org
Tue Jan 7 19:12:54 UTC 2014


LINGUIST List: Vol-25-12. Tue Jan 07 2014. ISSN: 1069 - 4875.

Subject: 25.12, FYI: ARCHER Corpus Available Online

Moderator: Damir Cavar, Eastern Michigan U <damir at linguistlist.org>

Reviews: 
Monica Macaulay, U of Wisconsin Madison
Rajiv Rao, U of Wisconsin Madison
Joseph Salmons, U of Wisconsin Madison
Mateja Schuck, U of Wisconsin Madison
Anja Wanner, U of Wisconsin Madison
       <reviews at linguistlist.org>

Homepage: http://linguistlist.org

Editor for this issue: Uliana Kazagasheva <uliana at linguistlist.org>
================================================================  

Date: Tue, 07 Jan 2014 14:12:47
From: David Denison [david.denison at manchester.ac.uk]
Subject: ARCHER Corpus Available Online

We are delighted to announce that ARCHER, A Representative Corpus of Historical English Registers, can for the first time be searched by registered users via the internet. The new version 3.2 also incorporates many improvements, including extensive non-linguistic mark-up to modern standards (TEI, XML), expansion of word-count by 84% to 3.3m words, and correction of existing texts and bibliographic information. 

The corpus runs from 1600 to 1999 and allows comparison of British and American English over a 250-year span; its multiple genres permit subtle sociohistorical discrimination. The CQPweb search engine is fast and easy to use for simple searches, and it also offers more complex searches and statistical information.

A search engine for ARCHER 3.2 is hosted by Lancaster University on its CQPweb server. The version now made available for searches comprises untagged, original-spelling files. The planned VARDed and CLAWS-tagged version will follow as soon as possible and will be made available to registered users. So will an additional online version hosted at the University of Zurich, tagged with the Treebank tagset and also chunked and parsed with a dependency grammar. Further details (including local access arrangements) are given on the ARCHER project website (www.manchester.ac.uk/archer). For copyright reasons, the amount of context that can be downloaded is limited, though adequate for most purposes. Users at any of the 14 consortium universities have local access without limits on context and can consult plain text and XML versions. All versions have identical text and non-linguistic mark-up.

The project is currently coordinated at the University of Manchester. You are invited to visit www.manchester.ac.uk/archer for further details of the corpus and the consortium. On the Documentation page, the website has a User Agreement form for you to download. This must be completed and submitted online.

David Denison and Nuria Yáñez-Bouza
On behalf of the ARCHER consortium 



Linguistic Field(s): Historical Linguistics
                     Text/Corpus Linguistics

Subject Language(s): English (eng)