<div dir="ltr">(apologies for cross-posting and any inadvertent grammatical errors)<div><br><br><div style="text-align:center"><b>COLING 2014 Tutorial on Automated Grammatical Error Correction for Language Learners</b></div>
<div style="text-align:center">Sunday, August 24, 2014</div><div><div style="text-align:center">Dublin, Ireland</div><div style="text-align:center"><a href="http://www.coling-2014.org/tutorials.php">http://www.coling-2014.org/tutorials.php</a><br>
</div><div><br></div><div><br></div><div><b>TUTORIAL DESCRIPTION</b><br><br>A fast-growing area in Natural Language Processing is the use of automated tools for identifying and correcting grammatical errors made by language learners. This growth has been fueled in part by the needs of the large number of people in the world who are learning and using a second or foreign language. For example, it is estimated that there are currently over one billion people who are non-native speakers of English. These numbers drive the demand for accurate tools that can help learners to write and speak proficiently in another language. Such demand also makes this an exciting time for those in the NLP community who are developing automated methods for grammatical error correction (GEC). Our motivation for the COLING tutorial is to make others more aware of this field and its particular set of challenges. For these reasons, we believe that the tutorial will potentially benefit a broad range of conference attendees.<br>
<br>In general, there has been a surge in interest in using NLP to address educational needs, which, in turn, has spawned the recurring ACL/NAACL workshop “Innovative Use of Natural Language Processing for Building Educational Applications” that will have its 9th edition at ACL 2014. The last three years, in particular, have been pivotal for GEC. Papers on the topic have become more commonplace at main conferences such as ACL, NAACL and EMNLP, and two editions of a Morgan & Claypool Synthesis Series book on the topic have been published (Leacock et al., 2014). In 2011 and 2012, the first shared tasks in GEC (Dale and Kilgarriff, 2011; Dale et al., 2012) were created, and dozens of teams from all over the world participated. These were followed by two successful CoNLL Shared Tasks on the topic in 2013 and 2014 (Ng et al., 2013).</div>
<div> <br></div><div>While there have been many exciting developments in GEC over the last few years, there is still considerable room for improvement, as state-of-the-art performance in detecting and correcting several important error types remains inadequate for real-world applications. We hope to engage researchers from other NLP fields to develop novel and effective approaches to these problems. Our tutorial is specifically designed to:</div>
<div><br></div><div>* Introduce an NLP audience to the challenges that language learners face and thus the challenges of designing NLP tools to assist in language acquisition<br><br>* Provide a history of GEC and the state-of-the-art approaches for different error types<br>
<br>* Show the need for cross-lingual error correction approaches and discuss novel methods for achieving this<br><br>* Discuss ways in which error correction techniques can have an impact on other NLP tasks<br><br><br>References<br>
<br>Claudia Leacock, Martin Chodorow, Michael Gamon and Joel Tetreault. 2014. Automated Grammatical Error Detection for Language Learners, Second Edition. Synthesis Lectures on Human Language Technologies. Morgan & Claypool Publishers.<br>
<br>Robert Dale and Adam Kilgarriff. Helping Our Own: The HOO 2011 pilot shared task. In Proceedings of the 13th European Workshop on Natural Language Generation. Nancy, France, September 2011.<br><br>Robert Dale, Ilya Anisimoff and George Narroway. HOO 2012: A report on the preposition and determiner error correction shared task. In Proceedings of the Seventh Workshop on Building Educational Applications Using NLP. Montreal, Canada, June 2012.<br>
<br>Hwee Tou Ng, Siew Mei Wu, Yuanbin Wu, and Joel Tetreault. The CoNLL-2013 shared task on grammatical error correction. In Proceedings of the Seventeenth Conference on Computational Natural Language Learning. Sofia, Bulgaria, August 2013.</div>
<div><br><br><b>TUTORIAL OUTLINE</b><br><br>1. Introduction<br><br></div><div>2. Special Problems of Language Learners</div><div>* Errors made by English Language Learners (ELLs)<br>* Influence of L1<br><br></div><div>3. Heuristic Rule-Based Approaches<br>
<br>4. Data-Driven Approaches to Error Correction<br>* Methods for detection and correction<br>* Types of training data<br>* Features<br>* Web-based methods<br><br>5. Annotation and Evaluation<br>* Annotation schemes<br>* Proposals for efficient annotation<br>
* Evaluation measures<br>* Crowdsourcing for annotation and evaluation<br><br>6. Current Trends in Error Correction<br>* Detection of ungrammatical sentences and other error types<br>* Shared tasks</div><div>* Going beyond the classification methodology<br>
* Error correction in other languages<br><br>7. Conclusions<br><br> <br><br><b>TUTORIAL ORGANIZERS</b><br><br><b>Joel Tetreault</b> is a Senior Research Scientist at Yahoo! Labs in New York City. His research focus is Natural Language Processing, with specific interests in anaphora, dialogue and discourse processing, machine learning, and applying these techniques to the analysis of English language learning and automated essay scoring. Previously he was Principal Manager of the Core Natural Language group at Nuance Communications, Inc., where he worked on the research and development of NLP tools and components for the next generation of intelligent dialogue systems. Prior to Nuance, he worked at Educational Testing Service for six years as a Managing Senior Research Scientist, where he researched automated methods for detecting grammatical errors by non-native speakers, plagiarism detection, and content scoring. Tetreault received his B.A. in Computer Science from Harvard University (1998) and his M.S. and Ph.D. in Computer Science from the University of Rochester (2004). He was also a postdoctoral research scientist at the University of Pittsburgh's Learning Research and Development Center (2004-2007), where he worked on developing spoken dialogue tutoring systems. In addition, he has co-organized the Building Educational Applications workshop series for 7 years and the CoNLL 2013 Shared Task on Grammatical Error Correction, and is currently NAACL Treasurer.<br>
<br><br><b>Claudia Leacock</b> is a Research Scientist at CTB McGraw-Hill who has been working on using NLP in educational applications for 20 years, focusing on automated scoring and grammatical error detection. She was previously a consultant for Microsoft Research, where she collaborated on the development of ESL Assistant, a web-based prototype tool for detecting and correcting grammatical errors of English language learners. As a Distinguished Member of Technical Staff at Pearson Knowledge Technologies, and previously as a Principal Development Scientist at Educational Testing Service, she developed tools for automated assessment of short-response content-based questions and for grammatical error detection. As a member of the WordNet group at Princeton University’s Cognitive Science Lab, her research focused on word sense disambiguation. Dr. Leacock received a B.A. in English from NYU and a Ph.D. in Linguistics from the City University of New York Graduate Center, and was a post-doctoral fellow at the IBM T.J. Watson Research Center.<br>
<br></div></div></div></div>