[Corpora-List] Learner corpora build & query tool?

Pascual Pérez-Paredes pascualf at um.es
Tue Feb 24 16:57:04 UTC 2009


Dear Simon,

Although not an answer to your question, you might want to have a look at
SACODEYL, an EU-funded initiative to collect and distribute English, French,
German, Italian, Lithuanian, Romanian, and Spanish teen talk.

We developed our own transcription, annotation and search tools, all of them
freeware and open source.

You can read more here:

Pérez-Paredes, P. & Alcaraz, J.M. (2009). Developing annotation solutions
for online Data Driven Learning. ReCALL, 21(1), 55-75.

and of course visit our website:

http://www.um.es/sacodeyl

and search tool:

http://sacodeyl.inf.um.es/sacodeyl-search2/

Best.

Pascual Pérez-Paredes




On Tue, Feb 24, 2009 at 6:19 AM, simon smith <smithsgj at nccu.edu.tw> wrote:
>>
>> I've been looking over the resources recommended to Mieke van der Velden
on the list with considerable interest.
>>
>> Here at NCCU in Taiwan, we have 8 language departments -- English,
French, German, Korean, Japanese, Spanish, Arabic, Turkish -- and we plan to
build a learner corpus for each. Although this sounds like an ambitious
scheme, it has support and funding from the central university
administration.
>>
>> The people studying these languages, here in Taiwan, are native speakers
of Chinese. I'm aware of Chinese speaker learner corpora of some of the
languages: English obviously, Spanish and Japanese (and German planned) at
National Chengkung University. But I'm interested to know if any of our
planned corpora will be firsts. It seems pretty unlikely that there exists a
Chinese speaker LC of Turkish, for example. So if you are reading this, and
you know of an existing Chinese speaker LC of one of our languages, perhaps
you could let me know.
>>
>> It's a longish-term project, and we're not too clear at the moment what
sort of interlanguage annotation or correction we'll be doing. Right now,
the important thing is to start collecting data. We could probably create
our own interface to do this, but I wonder if there is a (free or shareware)
product out there that we could use for LC building.
>>
>> It would need to be pretty straightforward to use, because the language
teachers collaborating will have no experience of corpora or corpus
linguistics. Some of them will, indeed, have very little computer experience
at all.
>>
>> Ideally, we would collect the data (as homework assignments) directly
from students. I'm wondering about the possibility of using Moodle for this,
either the Database or Wiki modules (there is a Corpus module, but it's not
supported any more). The students would input their data, and everyone would
be able to see it. In the Wiki, we could allow teachers to edit it, and a
record of changes would be kept.
>>
>> But I'm not sure how easy it would be to do annotation of a "corpus" in that
format, or really analyse it in a conventional way. There would be no
obvious way of generating a concordance, for example.
>>
>> I really like the idea of a shared resource which can be built, updated,
consulted and used by learners, all via the same interface.
>>
>> Any thoughts anyone?
>>
>> Replies in Chinese are welcome.
>>
>> Simon Smith, PhD
>>
>> Assistant Professor
>> Foreign Language Center
>> National Chengchi University
>>
>> office: Research Building 416
>> phone:  (0)2 2939 3091  x 88015
>> fax  +44 (0)871 243 1512
>>
>>
>> _______________________________________________
>> Corpora mailing list
>> Corpora at uib.no
>> http://mailman.uib.no/listinfo/corpora
>>



--
Dr. Pascual Pérez-Paredes
Traductor Oficial / Intérprete Jurado M.A.A.E.E.
Departamento de Filología Inglesa
Campus de la Merced
Universidad de Murcia
30071 MURCIA
SPAIN
SKYPE ID pascual.perez.paredes
TEL 34 968364378
FAX 34 968363185
http://www.um.es/dp-filologia-inglesa/paredes/
........................................
http://um.academia.edu/PascualP%C3%A9rezParedes
.......................................