[Corpora-List] Query: Corpora of American and British English that can be compared?

Mark Davies Mark_Davies at byu.edu
Thu Dec 20 13:59:49 UTC 2012


>> For my research I need to compare one set of agreement patterns in American and British English.

I'm not sure exactly what construction(s) you're looking for. If it's a very high frequency phenomenon, then Brown/LOB/Frown/FLOB or other 1-5 million word corpora may work fine.

For medium and low-frequency phenomena, you'll probably need something much larger, such as the 450 million word Corpus of Contemporary American English (COCA; http://corpus.byu.edu/coca) and the 100 million word British National Corpus (one version at http://corpus.byu.edu/bnc). The advantage of COCA/BNC at BYU is that you can compare the results "side by side" -- see http://corpus.byu.edu/comparing-corpora.asp for a number of examples.
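
Because the two corpora differ in size (roughly 450 million vs. 100 million words), raw hit counts are not directly comparable. A minimal sketch of one common way to handle this, assuming you have exported the raw counts yourself: normalize to hits per million words, then compute a two-corpus log-likelihood score. The function names and the counts in the usage lines are made up for illustration and are not taken from COCA, the BNC, or the BYU interface.

```python
import math


def per_million(hits, corpus_size):
    """Normalize a raw hit count to a rate per million words."""
    return hits / corpus_size * 1_000_000


def log_likelihood(hits_a, size_a, hits_b, size_b):
    """Two-corpus log-likelihood (G2) for one construction.

    Expected counts assume the construction occurs at the same
    rate in both corpora; a larger score means a bigger departure
    from that assumption.
    """
    total_hits = hits_a + hits_b
    total_size = size_a + size_b
    expected_a = size_a * total_hits / total_size
    expected_b = size_b * total_hits / total_size

    score = 0.0
    for observed, expected in ((hits_a, expected_a), (hits_b, expected_b)):
        if observed > 0:  # a zero count contributes nothing to the sum
            score += observed * math.log(observed / expected)
    return 2 * score


# Hypothetical counts, for illustration only.
american_hits, american_size = 900, 450_000_000
british_hits, british_size = 100, 100_000_000

print(per_million(american_hits, american_size))
print(per_million(british_hits, british_size))
print(log_likelihood(american_hits, american_size, british_hits, british_size))
```

A score like this only speaks to the difference in rates; it does not by itself show that the two corpora were collected in comparable ways, so differences in genre balance still need to be considered.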

I'm finishing up a two billion word corpus of web-based English from 20 different countries. It contains over 400 million words of British English and 400 million words of American English, and the texts were collected in the same way for each country. It should be publicly available within a few months. A sample screen shot (for the phrase < likely [v*] >, e.g. "they would likely know the answer") can be seen at: http://corpus.byu.edu/files/global_english.gif.

Best,

Mark Davies

============================================
Mark Davies
Professor of Linguistics / Brigham Young University
http://davies-linguistics.byu.edu/

** Corpus design and use // Linguistic databases **
** Historical linguistics // Language variation **
** English, Spanish, and Portuguese **
============================================

________________________________________
From: corpora-bounces at uib.no [corpora-bounces at uib.no] on behalf of Laure Gardelle [laure.gardelle at ens-lyon.fr]
Sent: Tuesday, December 18, 2012 2:23 AM
To: corpora at uib.no
Subject: [Corpora-List] Query: Corpora of American and British English that can be compared?

Dear colleagues,

For my research I need to compare one set of agreement patterns in
American and British English.
So would anyone know of two corpora (one for American English, the
other for British English) that would have sufficiently close
collection procedures for the hits they return to be compared (i.e. for
possible differences in proportion to be considered meaningful)?
Ideally I am looking for contemporary English, but if the data are a
bit older, it is not a problem.

Many thanks in advance for any help with this!

Laure Gardelle

_______________________________________________
UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora