[Corpora-List] Machine Translation and Spelling Correction

Jennifer Pedler jenny at dcs.bbk.ac.uk
Thu Dec 3 16:55:37 UTC 2009


Hi Nicola,

A few people have already pointed you to my colleague Roger Mitton's work. I have also made available the corpus of dyslexic real-word spelling errors (marked up with corrections) that I used for my PhD. You can download that and also a copy of my PhD thesis on real-word error correction from my webpage here: http://www.dcs.bbk.ac.uk/~jenny/resources.html

Hope that may be useful for you.

All the best,

Jenny
 


--------------------------------------------
Dr Jennifer Pedler
Programme Manager FdIT - Workplace Liaison
Dept of Computer Science & Information Systems
Birkbeck, London University
Malet St
London WC1E 7HX
020 7079 0720

-----Original Message-----
From: corpora-bounces at uib.no [mailto:corpora-bounces at uib.no] On Behalf Of Nicola Bertoldi
Sent: 03 December 2009 15:04
To: corpora at uib.no
Subject: [Corpora-List] Machine Translation and Spelling Correction

I am resending this message with a more appropriate heading.
Sorry for the inconvenience.



I am going to investigate how to improve machine translation when it is applied to texts corrupted by misspellings of any sort (non-word and real-word errors).

In this preliminary phase I am collecting information about the spelling correction task and about other applications and tasks that involve spelling correction.

In particular, I am interested in
- surveys about the task
- statistics about the most common misspellings in texts of different languages and different genres
- publicly available software for spelling correction
- available corpora of noisy texts
- any further resources that may be useful for my topic



Thanks!

Nicola

------ End of Forwarded Message

_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora



