[Corpora-List] Chinese spelling bakeoff

Simon Smith smithsgj at gmail.com
Wed Apr 16 10:18:49 UTC 2014


> *Task Description*
> The goal of this task is to evaluate the capability of a Chinese spelling
> checker. The passage consisting of several sentences with/without spelling
> errors will be given as the input. The checker should return the locations
> of incorrect characters and suggest the correct characters. Each character
> or punctuation occupies one position for counting location. If the input
> contains no spelling errors, the system should return ?*pid, 0*?. If the
> input contains at least one spelling errors, the output format is ?*pid [,
> location, correction]+*?.
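
As I read the quoted description, the output line could be built roughly as below. This is only a minimal sketch under my own assumptions: the function name format_result is hypothetical, locations are taken to be 1-based (so that 0 can signal "no errors"), and the system's corrected passage is assumed to have the same length as the input. None of this is confirmed by the task definition. Latin letters stand in for Chinese characters.

```python
def format_result(pid, original, corrected):
    """Build one output line from an input passage and its corrected form.

    Hypothetical helper: the name, the 1-based positions, and the
    equal-length requirement are assumptions, not part of the task text.
    """
    if len(original) != len(corrected):
        raise ValueError("passages must have the same number of characters")

    # Each character or punctuation mark occupies exactly one position.
    errors = [
        (pos, good)
        for pos, (bad, good) in enumerate(zip(original, corrected), start=1)
        if bad != good
    ]

    if not errors:
        # No spelling errors: "pid, 0"
        return f"{pid}, 0"

    # One or more errors: "pid [, location, correction]+"
    return pid + "".join(f", {pos}, {good}" for pos, good in errors)
```

For example, format_result("p1", "abXdY", "abcde") gives "p1, 3, c, 5, e", and an unchanged passage gives "p1, 0".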


Chinese doesn't have "spelling" as such, so I'm trying to figure out
what you take correct spelling in an alphabetic language to
correspond to in Chinese. For me, the closest analogy would be
writing the character correctly: no strokes missing, and no other
compositional errors.

That can't be what you mean, though, since you're looking at
electronic input. In the essays, the characters cannot possibly have
missing strokes or compositional errors; the errors can only be in the
choice of character. If a student writes pengyou using youmeiyou de
you instead of pengyou de you, for example, is that a spelling error,
since the phonetic realization of the correct and incorrect characters
is the same? Or, if someone wrote yueliang de yue instead of peng,
replacing the correct character with one that *looks* like it, would
that count?

Or is it that any incorrect character counts as a spelling mistake? But
that's not a "spelling" issue, is it?

(Does the last quoted line ( ?*pid) above show an example error in
Chinese? I don't think Chinese characters show up properly on the
Corpora list...)
___________________________


Simon Smith, PhD
Senior Lecturer
Dept of English & Languages
Coventry University

+44 2476 887 643

http://www.linkedin.com/pub/simon-smith/42/b77/173

http://tinyurl.com/simoncov
