[Corpora-List] Machine Translation and Spelling Correction
Jennifer Pedler
jenny at dcs.bbk.ac.uk
Thu Dec 3 16:55:37 UTC 2009
Hi Nicola,
A few people have already pointed you to my colleague Roger Mitton's work. I have also made available the corpus of dyslexic real-word spelling errors (marked up with corrections) that I used for my PhD. You can download that and also a copy of my PhD thesis on real-word error correction from my webpage here: http://www.dcs.bbk.ac.uk/~jenny/resources.html
Hope that may be useful for you.
All the best,
Jenny
--------------------------------------------
Dr Jennifer Pedler
Programme Manager FdIT - Workplace Liaison
Dept of Computer Science & Information Systems
Birkbeck, London University
Malet St
London WC1E 7HX
020 7079 0720
-----Original Message-----
From: corpora-bounces at uib.no [mailto:corpora-bounces at uib.no] On Behalf Of Nicola Bertoldi
Sent: 03 December 2009 15:04
To: corpora at uib.no
Subject: [Corpora-List] Machine Translation and Spelling Correction
I am resending this message with a more appropriate subject line.
Sorry for the inconvenience.
I am going to do some investigation to improve machine translation when it is applied to texts corrupted by misspellings of any sort (non-word, real-word errors).
In this preliminary phase I am collecting information about the spelling correction task and about other applications and tasks that involve spelling correction.
In particular, I am interested in
- surveys about the task
- statistics about the most common misspellings in texts of different languages and different genres
- publicly available software for spelling correction
- available corpora of noisy texts
- any further resources that may be useful for my topic
Thanks!
Nicola
------ End of Forwarded Message
_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora