[Corpora-List] Query: Corpora of American and British English that can be compared?

Eric Atwell E.S.Atwell at leeds.ac.uk
Thu Dec 20 11:38:43 UTC 2012


I agree with Adam and True Friend: LOB and Brown are the long-established
corpora for comparing UK and US English, dating from the 1960s.
BUT you asked for "... sufficiently close collection procedures
for the hits they return to be compared ...", which suggests you really
want web-as-corpus collections gathered more recently by web crawlers?
If so: World Wide English Corpus
http://www.comp.leeds.ac.uk/eric/wwe.shtml
includes 2M-word samples of UK English and US English, collected
using SketchEngine's WebBootCat web-as-corpus harvester,
for student exercises in comparing world varieties of English.

Eric Atwell, Leeds University





On Thu, 20 Dec 2012, Adam Kilgarriff wrote:

> Dear Laure,
> the straightforward answer is the 'Brown family' corpora - Brown and LOB
> were compiled with just this kind of analysis in mind: both are from 1961,
> and more comparable data points are available for 1991 (FROWN and FLOB) and
> (though maybe these are British English only) 1931, 1901 and 2006.
> 
> You can do the comparisons easily and directly in the Sketch Engine, where
> the data is already set up (including POS tagging) and the 'Brown family'
> corpus contains all the above except the 1901 part.
> 
> Regards
> 
> Adam
> 
> On 18 December 2012 09:23, Laure Gardelle <laure.gardelle at ens-lyon.fr>
> wrote:
>       Dear colleagues,
>
>       For my research I need to compare one set of agreement patterns
>       in American and British English.
>       So would anyone know of two corpora (one for American English,
>       the other for British English) that would have sufficiently
>       close collection procedures for the hits they return to be
>       compared (i.e. for possible differences in proportion to be
>       considered meaningful)? Ideally I am looking for contemporary
>       English, but if the data are a bit older, it is not a problem.
>
>       Many thanks in advance for any help with this!
>
>       Laure Gardelle
>
>       _______________________________________________
>       UNSUBSCRIBE from this page:
>       http://mailman.uib.no/options/corpora
>       Corpora mailing list
>       Corpora at uib.no
>       http://mailman.uib.no/listinfo/corpora
> 
> 
> 
> 
> --
> ========================================
> Adam Kilgarriff                  adam at lexmasterclass.com
> Director                         Lexical Computing Ltd
> Visiting Research Fellow         University of Leeds
> Corpora for all with the Sketch Engine
> DANTE: a lexical database for English
> ========================================
> 
>

-- 
Eric Atwell, Associate Professor, Language research group,
  I-AIBS Institute for Artificial Intelligence and Biological Systems
  School of Computing, Faculty of Engineering, UNIVERSITY OF LEEDS
  Leeds LS2 9JT, England.        TEL: 0113-3435430  FAX: 0113-3435468
  WWW: http://www.comp.leeds.ac.uk/eric
       http://www.comp.leeds.ac.uk/nlp
       http://www.comp.leeds.ac.uk/arabic