[Corpora-List] pronunciation lexica (SUMMARY)
James Magnuson
magnuson at paradox.psych.columbia.edu
Wed Sep 11 19:24:55 UTC 2002
Thank you to everyone who replied to my email last week regarding
pronunciations. Here is a summary of the replies.
1. Regarding freely available lists of pronunciations, there were two suggestions:
- The CMU pronunciation dictionary (about 125k words of N. American English):
see http://www.speech.cs.cmu.edu/cgi-bin/cmudict
- The Moby lexicon (175k entries):
see http://www.speech.cs.cmu.edu/comp.speech/Section1/Lexical/moby.html
2. Three possibilities that require licensing of some sort:
- Cambridge Dictionaries "English Pronouncing Dictionary" with
British and American pronunciations of 120k forms
see http://dictionary.cambridge.org/researchers.htm
- Oxford University Press has various US English pronunciation
resources with over 165,000 headwords and over 255,000 wordforms
see http://www.oup.co.uk/digital_reference
- PRONLEX, a pronunciation lexicon developed as part of the
CALLHOME project, with 90,988 lexical entries
see http://www.ldc.upenn.edu/Catalog/LDC97L20.html
3. Regarding my mystery lexica:
- Moby is, well, Moby (see #1)
- MIT was developed from the 1964 Webster's Pocket dictionary.
There is a description in:
Shipman, D.W. and Zue, V.W. (1982) "Properties of Large Lexicons:
Implications for Advanced Isolated Word Recognition Systems", Conference
Record, IEEE International Conference on Acoustics, Speech, and Signal
Processing, Paris, France, 546-549.
Thanks again to everyone who replied,
James Magnuson
Department of Psychology
Columbia University
1190 Amsterdam Ave., MC 5501
New York City, New York 10027
magnuson at psych.columbia.edu