[Corpora-List] grapheme-to-phoneme mapping

Simon King Simon.King at ed.ac.uk
Fri Aug 19 09:13:50 UTC 2005


n.chipere at reading.ac.uk wrote:
> Dear all
> 
> I am looking for a word list that specifies direct grapheme-phoneme
> mappings. The lists I'm familiar with, e.g. CMU Pronunciation Dictionary; MRC
> Psycholinguistic Database; Moby Pronunciator and the Oxford Learner's
> Dictionary all provide phonetic transcriptions for entire words but do not
> indicate which particular grapheme(s) correspond(s) to which particular
> phoneme(s). 

The letter-to-sound decision tree from Festival can do this: you
provide a letter, plus some left and right context letters, and it
predicts the phonemes (zero or more) that the letter maps to.

http://www.cstr.ed.ac.uk/projects/festival/
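
As a rough illustration of the idea (not Festival's actual code or API), the
sketch below builds one context window per letter and looks up zero or more
phonemes for it. The "#" padding symbol, the one-letter context width, the
function names and the tiny rule table are all assumptions made up for this
example; a trained decision tree would take the place of the table.

```python
from typing import Dict, List, Tuple

PAD = "#"  # assumed word-boundary symbol, chosen for this sketch

Window = Tuple[str, str, str]  # (left context, letter, right context)


def letter_windows(word: str, left: int = 1, right: int = 1) -> List[Window]:
    """Return one (left, letter, right) window per letter of the word."""
    padded = PAD * left + word.lower() + PAD * right
    windows = []
    for i in range(left, left + len(word)):
        windows.append((padded[i - left:i], padded[i], padded[i + 1:i + 1 + right]))
    return windows


# Hypothetical stand-in for a trained tree: context-specific rules first,
# then a per-letter default. The phoneme symbols are illustrative only.
TOY_RULES: Dict[Window, List[str]] = {
    ("x", "e", "#"): [],  # word-final "e" after "x" maps to no phoneme
}
TOY_DEFAULTS: Dict[str, List[str]] = {
    "a": ["ae"],
    "x": ["k", "s"],  # one letter mapping to two phonemes
    "e": ["eh"],
}


def predict(word: str) -> List[Tuple[str, List[str]]]:
    """Pair each letter with the phonemes predicted from its window."""
    result = []
    for window in letter_windows(word):
        letter = window[1]
        phones = TOY_RULES.get(window, TOY_DEFAULTS.get(letter, []))
        result.append((letter, phones))
    return result


if __name__ == "__main__":
    for letter, phones in predict("axe"):
        print(letter, phones)
```

The output is exactly the kind of list the original question asks for: each
letter of the word paired with the phoneme(s) it accounts for, possibly none.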

To train the model, the letters and phonemes in an existing dictionary
are aligned using dynamic programming (after the mapping is hand-seeded,
I think) - see

Alan W Black, Kevin Lenzo, and Vincent Pagel. Issues in building general
letter to sound rules. In The Third ESCA Workshop in Speech Synthesis,
pages 77-80, 1998.

available from

http://www.cstr.ed.ac.uk/publications/
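
The alignment step can be sketched roughly as follows. This is a simplified
dynamic-programming aligner written for illustration, not the procedure from
the paper: it assumes each letter takes at most one phoneme, that a
hand-seeded table (here called `allowed`) lists which phonemes each letter may
produce, and that a pairing outside the table costs a fixed penalty. The
names, the penalty value and the phoneme symbols are my own assumptions.

```python
from typing import Dict, List, Optional, Set, Tuple

MISMATCH_COST = 3  # assumed penalty for a pairing not in the seed table


def align(letters: str, phones: List[str],
          allowed: Dict[str, Set[str]]) -> List[Tuple[str, Optional[str]]]:
    """Align each letter with one phoneme or with None (a silent letter)."""
    n, m = len(letters), len(phones)
    if m > n:
        raise ValueError("this sketch allows at most one phoneme per letter")
    inf = float("inf")
    # cost[i][j]: cheapest alignment of the first i letters with j phonemes
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    back: List[List[Optional[str]]] = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0
    for i in range(1, n + 1):
        letter = letters[i - 1]
        for j in range(m + 1):
            # Option 1: the letter is silent and consumes no phoneme.
            if cost[i - 1][j] < cost[i][j]:
                cost[i][j] = cost[i - 1][j]
                back[i][j] = "skip"
            # Option 2: the letter produces phoneme j.
            if j > 0:
                ok = phones[j - 1] in allowed.get(letter, set())
                c = cost[i - 1][j - 1] + (0 if ok else MISMATCH_COST)
                if c < cost[i][j]:
                    cost[i][j] = c
                    back[i][j] = "emit"
    # Walk the back-pointers from the end to recover the alignment.
    pairs: List[Tuple[str, Optional[str]]] = []
    i, j = n, m
    while i > 0:
        if back[i][j] == "emit":
            pairs.append((letters[i - 1], phones[j - 1]))
            j -= 1
        else:
            pairs.append((letters[i - 1], None))
        i -= 1
    pairs.reverse()
    return pairs


if __name__ == "__main__":
    seed = {"k": {"k"}, "n": {"n"}, "i": {"ay", "ih"},
            "g": {"g"}, "h": {"hh"}, "t": {"t"}}
    for pair in align("knight", ["n", "ay", "t"], seed):
        print(pair)
```

With the toy seed table above, "knight" aligned against n, ay, t leaves "k",
"g" and "h" silent. A letter that yields two phonemes (such as "x") would
need extra handling, for example treating the pair as a single combined
symbol before alignment.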


I believe this model is as accurate as anything else available, but it
will of course have a much higher error rate than a proper lexicon (e.g.
our Unisyn lexicon http://www.cstr.ed.ac.uk/projects/unisyn/).

Simon

-- 
Dr. Simon King                               Simon.King at ed.ac.uk
Centre for Speech Technology Research          www.cstr.ed.ac.uk
For MSc/PhD info, visit  www.hcrc.ed.ac.uk/language-at-edinburgh


