[Corpora-List] pronunciation lexica (SUMMARY)
James Magnuson
magnuson at paradox.psych.columbia.edu
Wed Sep 11 19:24:55 UTC 2002
Thank you to everyone who replied to my email last week regarding
pronunciations. Here is a summary of the replies.
1. Regarding freely available lists of pronunciations, there were two
suggestions:
- The CMU pronunciation dictionary (about 125k words of North American
English):
see http://www.speech.cs.cmu.edu/cgi-bin/cmudict
- The Moby lexicon (175k entries):
see http://www.speech.cs.cmu.edu/comp.speech/Section1/Lexical/moby.html
2. Three possibilities that require licensing of some sort:
- Cambridge Dictionaries "English Pronouncing Dictionary" with
British and American pronunciations of 120k forms
see http://dictionary.cambridge.org/researchers.htm
- Oxford University Press has various US English pronunciation
resources with over 165,000 headwords and over 255,000 wordforms
see http://www.oup.co.uk/digital_reference
- PRONLEX, a pronunciation lexicon developed as part of the
CALLHOME project, with 90,988 lexical entries
see http://www.ldc.upenn.edu/Catalog/LDC97L20.html
3. Regarding my mystery lexica:
- Moby is, well, Moby (see #1)
- MIT was developed from the 1964 Webster's Pocket dictionary.
There is a description in:
Shipman, D.W. and Zue, V.W. (1982) "Properties of Large Lexicons:
Implications for Advanced Isolated Word Recognition Systems", Conference
Record, IEEE International Conference on Acoustics, Speech, and Signal
Processing, Paris, France, 546-549.
Thanks again to everyone who replied,
jim
--------------------------------------------------------
James Magnuson
Department of Psychology
Columbia University
1190 Amsterdam Ave., MC 5501
New York City, New York 10027
(212)854-5667
magnuson at psych.columbia.edu