[Corpora-List] grapheme-to-phoneme mapping
Simon King
Simon.King at ed.ac.uk
Fri Aug 19 09:13:50 UTC 2005
n.chipere at reading.ac.uk wrote:
> Dear all
>
> I am looking for a word list that specifies direct grapheme-phoneme
> mappings. The lists I'm familiar with, e.g. the CMU Pronunciation Dictionary, the MRC
> Psycholinguistic Database, Moby Pronunciator and the Oxford Learner's
> Dictionary, all provide phonetic transcriptions for entire words but do not
> indicate which particular grapheme(s) correspond(s) to which particular
> phoneme(s).
The letter-to-sound decision tree from Festival can do this: you
provide a letter, plus some left and right context letters, and it
predicts the zero or more phonemes that the letter maps to.
http://www.cstr.ed.ac.uk/projects/festival/
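As a rough illustration of the input such a model sees, here is a minimal
Python sketch. It is not Festival code: the function names, the window
width, the "#" padding symbol and the toy rules are all assumptions made
for the example. It only shows how each letter is presented with its
context, and how a prediction can contain zero, one or several phonemes.

```python
def letter_contexts(word, width=3, pad="#"):
    """Return one (left, letter, right) tuple per letter of `word`.

    `width` and `pad` are illustrative choices, not Festival's settings.
    """
    padded = pad * width + word.lower() + pad * width
    contexts = []
    for i in range(width, width + len(word)):
        left = padded[i - width:i]            # letters before the target
        right = padded[i + 1:i + 1 + width]   # letters after the target
        contexts.append((left, padded[i], right))
    return contexts


def toy_predict(left, letter, right):
    """Hypothetical stand-in for a trained tree; returns a list of phonemes."""
    if letter == "x":
        return ["k", "s"]      # one letter, two phonemes
    if letter == "e" and right.startswith("#"):
        return []              # toy rule: word-final e is silent
    return [letter]            # placeholder: echo the letter itself


if __name__ == "__main__":
    for ctx in letter_contexts("axe", width=2):
        print(ctx, "->", toy_predict(*ctx))
```

A real tree would be learned from aligned dictionary entries rather than
written by hand.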
To train the model, the letters and phonemes in an existing dictionary
are aligned using dynamic programming (after the mapping is hand-seeded,
I think) - see
Alan W Black, Kevin Lenzo, and Vincent Pagel. Issues in building general
letter to sound rules. In The Third ESCA Workshop in Speech Synthesis,
pages 77-80, 1998.
available from
http://www.cstr.ed.ac.uk/publications/
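To give a flavour of the alignment step, here is a small Python sketch of
a dynamic-programming aligner. It is a simplification under stated
assumptions, not the procedure from the paper: each letter maps to either
one phoneme or nothing (written "_"), the hand-seeded table `allowed`
lists the phonemes each letter may produce, and any pairing outside the
table costs 1. The names `align`, `allowed` and the phoneme symbols are
made up for the example.

```python
def align(letters, phones, allowed, eps="_"):
    """Align each letter with one phoneme or with `eps` (no phoneme).

    `allowed` maps a letter to the set of symbols it may produce;
    pairings outside that set are permitted but cost 1.
    """
    n, m = len(letters), len(phones)
    if m > n:
        raise ValueError("this sketch needs at least one letter per phoneme")
    inf = float("inf")
    # cost[i][j]: cheapest alignment of the first i letters with the first j phonemes
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0
    for i in range(1, n + 1):
        ok = allowed.get(letters[i - 1], ())
        for j in range(0, min(i, m) + 1):
            # Option 1: this letter produces no phoneme.
            c = cost[i - 1][j] + (0 if eps in ok else 1)
            if c < cost[i][j]:
                cost[i][j], back[i][j] = c, "skip"
            # Option 2: this letter produces phones[j - 1].
            if j > 0:
                c = cost[i - 1][j - 1] + (0 if phones[j - 1] in ok else 1)
                if c < cost[i][j]:
                    cost[i][j], back[i][j] = c, "emit"
    # Follow the back-pointers from the end to recover the alignment.
    pairs, i, j = [], n, m
    while i > 0:
        if back[i][j] == "emit":
            pairs.append((letters[i - 1], phones[j - 1]))
            j -= 1
        else:
            pairs.append((letters[i - 1], eps))
        i -= 1
    pairs.reverse()
    return pairs


if __name__ == "__main__":
    # Hand-seeded table for one word; the phoneme symbols are illustrative.
    allowed = {"k": {"k", "_"}, "n": {"n"}, "i": {"ih", "ay"},
               "g": {"g", "_"}, "h": {"hh", "_"}, "t": {"t"}}
    print(align(list("knight"), ["n", "ay", "t"], allowed))
```

The aligned pairs, one per letter, are the kind of data a letter-to-sound
tree could then be trained on.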
I believe this model is as accurate as anything else available, but it will
of course have a much higher error rate than a proper lexicon (e.g. our
Unisyn lexicon, http://www.cstr.ed.ac.uk/projects/unisyn/).
Simon
--
Dr. Simon King Simon.King at ed.ac.uk
Centre for Speech Technology Research www.cstr.ed.ac.uk
For MSc/PhD info, visit www.hcrc.ed.ac.uk/language-at-edinburgh