[Corpora-List] Query: Corpora of American and British English that can be compared?

Laure Gardelle laure.gardelle at ens-lyon.fr
Thu Dec 20 15:31:39 UTC 2012


Many thanks for your replies, which are very helpful!

All the best

Laure

Eric Atwell <E.S.Atwell at leeds.ac.uk> wrote:

> I agree with Adam and True Friend: LOB v Brown are the long-established
> corpora for comparing UK v US English, from the 1960s.
> BUT you asked for "... sufficiently close collection procedures
> for the hits they return to be compared ...", which suggests you really
> want web-as-corpus collections gathered more recently by
> web crawlers? If so, the World Wide English Corpus
> http://www.comp.leeds.ac.uk/eric/wwe.shtml
> includes 2M-word samples of UK English and US English, collected
> using SketchEngine's WebBootCat web-as-corpus harvester,
> for student exercises in comparing world varieties of English.
>
> Eric Atwell, Leeds University
>
> On Thu, 20 Dec 2012, Adam Kilgarriff wrote:
>
>> Dear Laure,
>> the straightforward answer is the 'Brown family' corpora - Brown and LOB
>> were compiled with just this kind of analysis in mind: they are both from
>> 1961, and more comparable data points are available for 1991 (FROWN and
>> FLOB) and (though maybe for British English only) 1931, 1901 and 2006.
>>
>> You can do the comparisons easily and directly in the Sketch Engine, where
>> the data is already set up (including POS tagging) and the 'Brown family'
>> corpus contains all the above except the 1901 part.
>>
>> Regards
>>
>> Adam
>>
>> On 18 December 2012 09:23, Laure Gardelle <laure.gardelle at ens-lyon.fr>
>> wrote:
>>      Dear colleagues,
>>
>>      For my research I need to compare one set of agreement patterns
>>      in American and British English.
>>      So would anyone know of two corpora (one for American English,
>>      the other for British English) that would have sufficiently
>>      close collection procedures for the hits they return to be
>>      compared (i.e. for possible differences in proportion to be
>>      considered meaningful)? Ideally I am looking for contemporary
>>      English, but if the data are a bit older, it is not a problem.
>>
>>      Many thanks in advance for any help with this!
>>
>>      Laure Gardelle
>>
>>      _______________________________________________
>>      UNSUBSCRIBE from this page:
>>      http://mailman.uib.no/options/corpora
>>      Corpora mailing list
>>      Corpora at uib.no
>>      http://mailman.uib.no/listinfo/corpora
>>
>> --
>> ========================================
>> Adam Kilgarriff                  adam at lexmasterclass.com
>> Director                         Lexical Computing Ltd
>> Visiting Research Fellow         University of Leeds
>> Corpora for all with the Sketch Engine
>> DANTE: a lexical database for English
>> ========================================
>>
>>
>
> -- 
> Eric Atwell, Associate Professor, Language research group,
>  I-AIBS Institute for Artificial Intelligence and Biological Systems
>  School of Computing, Faculty of Engineering, UNIVERSITY OF LEEDS
>  Leeds LS2 9JT, England.        TEL: 0113-3435430  FAX: 0113-3435468
>  WWW: http://www.comp.leeds.ac.uk/eric
>       http://www.comp.leeds.ac.uk/nlp
>       http://www.comp.leeds.ac.uk/arabic
>

