[Corpora-List] 'Standard European English' ?
Yorick Wilks
yorick at dcs.shef.ac.uk
Fri Mar 3 18:06:23 UTC 2006
Fair enough---I see that generalizations across the subsets could be
interesting provided they came from radically different language
families. And of course, I had forgotten the security interest in the
general corpus: testing whether a text/speaker could be pulled out or
diagnosed as having one among several possible native origins---a
Holmes/Poirot sort of speciality.
YW
On 3 Mar 2006, at 10:57, Somers, Harold wrote:
>> One version of this discussion was had a few years ago when
>> it was seriously proposed---I forget who by---to create a
>> corpus of "non-native English"; not a corpus of specific
>> Englishes from specific non-native groups (e.g. so as to
>> grammar/spell correct the English of French speakers, for
>> example, a useful and real task)---but rather some general
>> corpus. I think the proposal collapsed under the
>> ridiculousness of the idea. I do hope so and that it's not out
>> there somewhere waiting for users!
>>
>
> You're not referring perhaps to the ever-growing corpus of Learner
> English collected by Sylvie Granger and colleagues at Louvain? Not so
> ridiculous an idea for people interested in EFL, I think. That corpus is
> collected specifically as a source of errors, and each text is of course
> identified by the native language of the source, among other things. So
> it is simultaneously "a corpus of specific Englishes from specific
> non-native groups" and a more "general corpus", so I'm not sure if you
> would think it ridiculous or worthy. In any case, what IS clear, from
> the perspective of both teaching EFL and correcting non-native English
> (we and others called it "interference checking" when we worked on it
> many years ago!) is that some learners' errors are due to specific
> interference from the native language, and some are more generic,
> perhaps due to particular idiosyncrasies or irregularities of English,
> no matter what the learner's native language. So a generic corpus of
> learner English would help to identify the latter.
>
>
>