[Corpora-List] HOO: A New Shared Task in Text Correction

Robert Dale robert.dale at mq.edu.au
Thu Apr 14 10:30:16 UTC 2011


We're very pleased to announce a new shared task in automated text
correction.

With an increasing number of papers in Natural Language Processing and
Computational Linguistics being authored by non-native English speakers
(NNSs), we think it's time the community provided more support for those
authors. As a field that works on computational techniques for processing
text, we're in a better position than most to do something useful; so the
aim of this shared task (called HOO, for 'Helping Our Own') is to promote the
use of NLP tools and techniques to help improve the textual quality of
papers written by NNSs in the field. Of course, we're not being rigidly
inward-looking: techniques developed here will be useful for authors in other
disciplines too; but we figure that this approach will get us maximum
traction.

The task offers opportunities for researchers working in a wide range of NLP
areas:  spell checking, grammar checking, style checking, paraphrasing,
machine translation, text compression, text simplification ... the
possibilities are endless, with techniques developed for quite different
purposes having potential to assist.  Participating teams can choose to
focus on specific subsets of errors and corrections, or to try to achieve
universal repair.

The initial development data set is under construction and will be released
soon; it consists of 1000-word excerpts of text from real papers that have
been graciously contributed to the project by their authors, each
subsequently marked up with corrections. An initial sample paper that gives
a flavour of the kinds of corrections we're dealing with and the way in
which they are marked up is available from the HOO website at
http://www.clt.mq.edu.au/research/projects/hoo/.  Please visit the site to
register your interest and to be added to a mailing list for project
updates.

More information about the aims of the project can be found in the following
paper:

R Dale and A Kilgarriff [2010] Helping Our Own: Text Massaging for
Computational Linguistics as a New Shared Task. In Proceedings of the 6th
International Natural Language Generation Conference, 7th-9th July 2010,
Dublin, Ireland.
[http://www.clt.mq.edu.au/research/projects/hoo/files/2010_DaleKilg_INLG_HOO.pdf]

The schedule for this initial pilot run of the shared task is as follows:

14 April: HOO launched
30 April: Announcement and public release of scripts and development data
10 June: Evaluation data available to participants
10 July: Latest date for return of corrected scripts
31 July: Announcement of results
Aug, Sept: Participants prepare their system descriptions and error
analyses
28-30 Sept: Workshop (with ENLG, Nancy, France)

We look forward to your participation in this exercise!

Robert Dale and Adam Kilgarriff
