[Corpora-List] Summary: Studies of spelling error frequency in journalistic text
Paul McNamee
paul.mcnamee at jhuapl.edu
Thu Sep 4 20:08:39 UTC 2008
I only received responses from Paul Rayson and Alistair Baron of Lancaster
University regarding my interest in studies of spelling error frequency
in journalistic text. Here's the summary.
They pointed me to two papers.
The paper by Church and Gale (1991) stated a rate equivalent to 1
error per 2000 words for the Associated Press, but they did not give a
citation for that figure.
This is not news, but Mitton's paper (like his book) reports a rate
of 25 errors per thousand for handwritten essays by secondary school
students, though there was considerable variation between good
spellers and poor spellers.
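To put the two figures on a common scale, here is a rough sketch of
the arithmetic. It assumes Mitton's "per thousand" means per thousand
words, which the summary above does not state outright; the helper
name per_thousand is purely illustrative.

```python
from fractions import Fraction


def per_thousand(errors, words):
    """Return an error rate expressed as errors per 1000 words."""
    return Fraction(errors, words) * 1000


# Church and Gale (1991): 1 error per 2000 words (Associated Press).
ap_rate = per_thousand(1, 2000)

# Mitton (1987): 25 errors per thousand (handwritten essays).
# Assumption: the denominator is words.
essay_rate = per_thousand(25, 1000)

print(float(ap_rate))        # 0.5
print(float(essay_rate))     # 25.0
print(essay_rate / ap_rate)  # 50
```

Under that assumption the essay rate is about 50 times the newswire
rate, though the two figures come from very different kinds of text.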
Here are the papers:
@Article{Church:1991,
title = "Probability Scoring for Spelling Correction",
author = "Kenneth Ward Church and William A. Gale",
journal = "Statistics and Computing",
year = "1991",
number = "2",
volume = "1",
pages = "93--103",
}
@Article{Mitton:1987,
title = "Spelling Checkers, Spelling Correctors and the Misspellings of Poor Spellers",
author = "Roger Mitton",
journal = "Information Processing and Management",
year = "1987",
volume = "23",
number = "5",
pages = "495--505",
}
- Paul
On Thu, 28 Aug 2008, Paul McNamee wrote:
> I am looking for references to studies reporting spelling error rates
> in modern journalistic text. Say for newspapers with large
> circulations. Ideally I'm looking for a number, such as "1 in 1000
> words", but breakdowns by type of error (e.g., for proper/common
> nouns, etc...) would be a bonus.
>
> I'm less interested, but would still be curious to know of any studies
> that have either: (1) compared spelling error frequency across textual
> styles (including news); or, (2) compared rates across languages with
> differing morphological complexity.
>
> Thanks!
>
> - Paul
>