[Corpora-List] Spellchecker evaluation corpus

Stefan Bordag sbordag at informatik.uni-leipzig.de
Sat Apr 9 08:45:00 UTC 2011


Hi everyone,

It seems that for every conceivable NLP task there is some agreed-upon 
evaluation data set, or at least one that is used in several papers. 
Yet for some strange reason I seem to be utterly unable to find any 
such test set for the spell checking task!

Am I doing something wrong, or is there no such data set? I know I can 
make synthetic tests by systematically inserting, deleting, swapping, 
etc. letters in my own test data, but this would give me results that 
I cannot compare to any other results. Hence, is there some accepted 
evaluation forum that I am missing? Whenever I include "spell check" in 
any form in search queries, I get lots of tutorials on how to write a 
spellchecker and almost nothing else...
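
For concreteness, the synthetic approach mentioned above could look 
roughly like the sketch below. It is only a minimal illustration in 
Python: the function names, the lowercase ASCII alphabet, and the fixed 
random seed are assumptions made for the example, not part of any 
established benchmark, and words are assumed to be non-empty.

```python
import random
import string


def insert_letter(word, pos, letter):
    """Insert `letter` before index `pos`."""
    return word[:pos] + letter + word[pos:]


def delete_letter(word, pos):
    """Remove the character at index `pos`."""
    return word[:pos] + word[pos + 1:]


def substitute_letter(word, pos, letter):
    """Replace the character at index `pos` with `letter`."""
    return word[:pos] + letter + word[pos + 1:]


def swap_letters(word, pos):
    """Transpose the characters at `pos` and `pos + 1`."""
    return word[:pos] + word[pos + 1] + word[pos] + word[pos + 2:]


def corrupt(word, rng):
    """Apply one randomly chosen single-edit error to a non-empty word.

    The result can occasionally equal the input (for example when a
    letter is substituted by itself); filter such cases if needed.
    """
    ops = ["insert", "delete", "substitute"]
    if len(word) >= 2:  # a swap needs two adjacent characters
        ops.append("swap")
    op = rng.choice(ops)
    letter = rng.choice(string.ascii_lowercase)  # assumed alphabet
    if op == "insert":
        return insert_letter(word, rng.randrange(len(word) + 1), letter)
    if op == "delete":
        return delete_letter(word, rng.randrange(len(word)))
    if op == "substitute":
        return substitute_letter(word, rng.randrange(len(word)), letter)
    return swap_letters(word, rng.randrange(len(word) - 1))


def make_test_pairs(words, seed=0):
    """Return (misspelled, correct) pairs, reproducible for a given seed."""
    rng = random.Random(seed)
    return [(corrupt(w, rng), w) for w in words]
```

Fixing the seed makes the generated test set reproducible, but it does 
not solve the comparability problem: results on such data still depend 
on the word list and on the assumed error model.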

Best regards,
Stefan Bordag
