[Corpora-List] Query: Corpora of American and British English that can be compared?
Eric Atwell
E.S.Atwell at leeds.ac.uk
Thu Dec 20 11:38:43 UTC 2012
I agree with Adam and True Friend: LOB and Brown are the long-established
corpora for comparing UK and US English, dating from the 1960s.
BUT you asked for " ... sufficiently close collection procedures
for the hits they return to be compared ..." which suggests you really
want web-as-corpus collections gathered more recently by web-crawlers?
If so: World Wide English Corpus
http://www.comp.leeds.ac.uk/eric/wwe.shtml
includes 2M-word samples of UK English and US English, collected
using SketchEngine's WebBootCat web-as-corpus harvester,
for student exercises in comparing world varieties of English.
Eric Atwell, Leeds University
On Thu, 20 Dec 2012, Adam Kilgarriff wrote:
> Dear Laure,
> the straightforward answer is the 'Brown family' corpora - Brown and LOB
> were compiled with just this kind of analysis in mind: they were both 1961
> and more comparable data points are available for 1991 (FROWN and FLOB) and
> (tho maybe this is British English only) 1931, 1901 and 2006.
>
> You can do the comparisons easily and directly in the Sketch Engine, where
> the data is already set up (including POS-tagging) and the 'Brown family'
> corpus contains all the above except the 1901 part.
>
> Regards
>
> Adam
>
> On 18 December 2012 09:23, Laure Gardelle <laure.gardelle at ens-lyon.fr>
> wrote:
> Dear colleagues,
>
> For my research I need to compare one set of agreement patterns
> in American and British English.
> So would anyone know of two corpora (one for American English,
> the other for British English) that would have sufficiently
> close collection procedures for the hits they return to be
> compared (i.e. for possible differences in proportion to be
> considered meaningful)? Ideally I am looking for contemporary
> English, but if the data are a bit older, it is not a problem.
>
> Many thanks in advance for any help with this!
>
> Laure Gardelle
>
> _______________________________________________
> UNSUBSCRIBE from this page:
> http://mailman.uib.no/options/corpora
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora
>
> --
> ========================================
> Adam Kilgarriff adam at lexmasterclass.com
>
> Director Lexical Computing Ltd
>
> Visiting Research Fellow University of Leeds
> Corpora for all with the Sketch Engine
> DANTE: a lexical database for English
> ========================================
>
>
--
Eric Atwell, Associate Professor, Language research group,
I-AIBS Institute for Artificial Intelligence and Biological Systems
School of Computing, Faculty of Engineering, UNIVERSITY OF LEEDS
Leeds LS2 9JT, England. TEL: 0113-3435430 FAX: 0113-3435468
WWW: http://www.comp.leeds.ac.uk/eric
http://www.comp.leeds.ac.uk/nlp
http://www.comp.leeds.ac.uk/arabic