[Corpora-List] error tagging

Hitoshi Isahara isahara at crl.go.jp
Fri Sep 26 18:22:42 UTC 2003


Dear Tim,
How are you?
Thank you very much for mentioning our SST corpus.

Dear Belen,
If you are interested in the SST corpus, please
contact Ms. Emi Izumi (emi at crl.go.jp) and me.

Best,

Hitoshi Isahara (isahara at crl.go.jp)
Leader of the Computational Linguistics Group
Communications Research Laboratory, Japan


At Fri, 26 Sep 2003 10:49:31 -0700 (PDT),
Timothy Baldwin <tbaldwin at csli.stanford.edu> wrote:
>
> > I am interested in error tagging and I am looking for corpora which are (or are being) error tagged. Do you know of any? And do you know of any available error tagset?
>
> One more recent effort I know of is the SST Corpus, which is a 1m word corpus
> of transcribed English speech by Japanese learners of English. Various errors
> are tagged, although I can't find any online account of the full tagset. There
> are a couple of papers in English on the corpus, notably:
>
> Tono, Y., Kaneko, T., Isahara, H., Saiga, T. and Izumi, E.  The Standard
> Speaking Test (SST) Corpus: A 1 million-word spoken corpus of Japanese
> learners of English and its implications for L2 lexicography. Lee, S. (ed.)
> ASIALEX 2001 Proceedings: Asian Bilingualism and the Dictionary. The Second
> Asialex International Congress, August 8-10, 2001, Yonsei University, Korea,
> pp. 257-262
>
> There is a web page with some documentation and a copy of this paper at:
>
> http://leo.meikai.ac.jp/~tono/sst/
>
> There was also a paper at this year's ACL:
>
> Emi Izumi, Kiyotaka Uchimoto, Toyomi Saiga, Thepchai Supnithi and Hitoshi
> Isahara (2003) Automatic error detection in the Japanese learners' English
> spoken data. In Companion Volume to the Proceedings of the 41st Annual Meeting
> of the Association for Computational Linguistics (ACL '03), pp. 145-148.
>
> which is also available online at:
>
> http://acl.ldc.upenn.edu/acl2003/posterdemo/pdf/Izumi.pdf
>
>
>
> Tim
>
> *-----------------------------------*
>
> Timothy Baldwin
> Senior research engineer
> Multiword Expression project
> CSLI LinGO Lab
>
>
> Contact details:
>
> Email: tbaldwin at csli.stanford.edu
> Tel:   (+1)-650-723-0515
> Fax:   (+1)-650-723-2166
>
> *-----------------------------------*
>


