[Corpora-List] Request for advice on creating a learners' corpus
Victoria Muehleisen
vicky at waseda.jp
Wed Jan 26 15:58:00 UTC 2005
Hello Everyone,
I teach English at a university in Japan, and we recently received some grant money to set up a learners' corpus of students' essays written in English.
Although we have some ideas of how we can begin doing research once we have the corpus, we don't know anything about actually setting it up. What are the best formats for storing the essays? For marking up the data? What kind of information will be most useful to add to the files? (For example, we know that we'll want to identify the level of the class the essay was written for--there are basic, intermediate, and advanced level writing courses--and we'll also want to code for the native language of the writer--not all the students are Japanese--but are there other kinds of variables we should keep track of?)
We would appreciate references to books/articles/web sites on setting up a learners' corpus, especially ones that don't assume too much technical computer knowledge. We'll have people available to help us with the technical side, but we need to tell them what we want to do.
In addition to references, if there is anyone who has created a learners' corpus and could warn us about any mistakes to avoid, that would also be very helpful. And at the next stage, we'll need to start thinking about issues of student privacy/permission, so any references on those issues (in particular, ways that other corpus-creators have handled them) would be very useful.
Thanking you in advance,
*********************************
Victoria Muehleisen
School of International Liberal Studies Waseda University
Nishi-Waseda 1-6-1
Shinjuku-ku, Tokyo 169-8050
E-mail: <vicky at waseda.jp>
Home page: <www.f.waseda.jp/vicky>