[Corpora-List] Request for advice on creating a learners' corpus

Victoria Muehleisen vicky at waseda.jp
Wed Jan 26 16:27:17 UTC 2005


Hello Everyone,

I teach English at a university in Japan, and we recently received some
grant money to set up a learners' corpus of students' essays written
in English.

Although we have some ideas of how we can begin doing research once we
have the corpus, we don't know anything about actually setting it up.
What are the best formats for storing the essays?  For marking up the
data?  What kind of information will be most useful to add to the
files? (For example, we know that we'll want to identify the level of
the class the essay was written for--there are basic, intermediate, and
advanced level writing courses--and we'll also want to code for the
native language of the writer--not all the students are Japanese--but
are there other kinds of variables we should keep track of?)

We would appreciate references to books/articles/web sites on setting
up a learners' corpus, especially ones that don't assume too much
technical computer knowledge.  We'll have people available to help us
with the technical side, but we need to tell them what we want to do.

In addition to references, if there is anyone who has created a
learners' corpus and could warn us about any mistakes to avoid, that
would also be very helpful.  And at the next stage, we'll need to start
thinking about issues of student privacy/permission, so any references
on those issues (in particular, ways that other corpus creators have
handled them) would be very useful.

Thanking you in advance,

*********************************
Victoria Muehleisen

School of International Liberal Studies, Waseda University
Nishi-Waseda 1-6-1
Shinjuku-ku, Tokyo 169-8050

E-mail: <vicky at waseda.jp>
Home page: <www.f.waseda.jp/vicky>


