[Corpora-List] Call For Participation: CLP-2014 Bakeoff on Chinese Spelling Check
Lung-Hao Lee
lunghaolee at gmail.com
Tue Apr 15 16:11:04 UTC 2014
(Apologies for cross-posting)
We are organizing an international bakeoff on *Chinese Spelling Check* at
*CLP-2014 (Oct. 20-21 in Wuhan, China)*, the 3rd conference jointly
organized by the Chinese Information Processing Society of China (*CIPS*) and
the ACL Special Interest Group on Chinese Language Processing (*SIGHAN*).
You are welcome to participate in our task.
For more information, kindly visit
http://ir.itc.ntnu.edu.tw/clp2014/task2csc.html
*Introduction*
The number of people learning Chinese as a Foreign Language (CFL) has
boomed in recent decades, and it is expected to grow even larger in the
years to come. However, unlike the English learning environment, for which
many learning techniques have been developed, tools to support CFL learners
are relatively rare, especially tools that can automatically detect and
correct Chinese spelling and grammatical errors. For example, Microsoft
Word does not yet support these functions for Chinese, although it has
supported them for English for years. In this bakeoff, essays written by
CFL learners were collected for developing automatic spelling checkers. The
hope is that through such evaluation campaigns, more innovative
computer-assisted techniques will emerge, more effective Chinese learning
resources will be built, and state-of-the-art NLP techniques will be
advanced for educational applications.
*Task Description*
The goal of this task is to evaluate the capability of a Chinese spelling
checker. A passage consisting of several sentences, with or without
spelling errors, will be given as the input. The checker should return the
locations of the incorrect characters and suggest the correct characters.
Each character or punctuation mark occupies one position when counting
locations. If the input contains no spelling errors, the system should
return “*pid, 0*”. If the input contains at least one spelling error, the
output format is “*pid [, location, correction]+*”.
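As an illustration of the output format described above, here is a minimal
Python sketch. It is not the organizers' scoring or submission tool. The
function names, the passage ID "A2-0001-1", and the sample sentence are all
invented for this example, and it assumes that locations are counted from 1
(the announcement does not say so explicitly, but 0 is reserved for "no
errors") and that the corrected passage has the same length as the input.

```python
from typing import List, Tuple

Correction = Tuple[int, str]  # (location, corrected character)


def diff_corrections(original: str, corrected: str) -> List[Correction]:
    """Derive (location, correction) pairs from two equal-length passages."""
    if len(original) != len(corrected):
        raise ValueError("passages must have the same number of characters")
    # Every character or punctuation mark occupies one position.
    # Assumption: locations start at 1, since 0 is used for "no errors".
    return [
        (location, new)
        for location, (old, new) in enumerate(zip(original, corrected), start=1)
        if old != new
    ]


def format_result(pid: str, corrections: List[Correction]) -> str:
    """Render one result line: 'pid, 0' or 'pid [, location, correction]+'."""
    if not corrections:
        return f"{pid}, 0"
    parts = [pid]
    for location, char in sorted(corrections):
        parts.extend([str(location), char])
    return ", ".join(parts)


# Hypothetical passage: the 6th character (心) should be 興.
pairs = diff_corrections("我今天很高心。", "我今天很高興。")
print(format_result("A2-0001-1", pairs))  # A2-0001-1, 6, 興
print(format_result("A2-0001-2", []))     # A2-0001-2, 0
```

A real checker would of course have to find the corrected characters
itself; the diff step here only shows how positions are counted once a
correction is known.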
*Data Sets*
The policy of our evaluation is an open test. Participants can employ any
linguistic and computational resources to develop their spelling checkers.
For example, the datasets with gold standard annotation from last year's
spelling check bakeoff can be freely downloaded at
http://ir.itc.ntnu.edu.tw/lre/sighan7csc.html for your reference. This
year, we also provide passages from CFL learners' essays, selected from the
NTNU learner corpus, for training purposes. The data will be released in
SGML format. In addition, at least 1,000 test passages, selected to cover
different levels of complexity, will be used for testing.
*Important Dates*
- Registration for Bakeoffs open: *2014-03-20*
- Training data released: *2014-05-01*
- Dry run (format validation): *2014-05-20*
- Registration for Bakeoffs close: *2014-06-30*
- Test data released: *2014-07-30 (18:00 Beijing Time)*
- Test result submission deadline: *2014-08-01 (18:00 Beijing Time)*
- Test result evaluation released: *2014-08-20*
- Evaluation report submission deadline: *2014-08-26*
- Evaluation report reviews return: *2014-09-01*
- Final evaluation report submission deadline: *2014-09-10*
   - Main Conference: *2014-10-20/21*
On behalf of the co-organizers,
Liang-Chih Yu, Lung-Hao Lee, Yuen-Hsien Tseng, and Hsin-Hsi Chen