[Corpora-List] pronunciation lexica (SUMMARY)

James Magnuson magnuson at paradox.psych.columbia.edu
Wed Sep 11 19:24:55 UTC 2002


Thank you to everyone who replied to my email last week regarding
pronunciations. Here is a summary of the replies.

1. Regarding freely available lists of pronunciations, there were two
   suggestions:
	- The CMU pronunciation dictionary (about 125k words of North
	   American English):
see http://www.speech.cs.cmu.edu/cgi-bin/cmudict
	- The Moby lexicon (175k entries):
see http://www.speech.cs.cmu.edu/comp.speech/Section1/Lexical/moby.html

2. Three possibilities that require licensing of some sort:
	- Cambridge Dictionaries "English Pronouncing Dictionary" with
	  British and American pronunciations of 120k forms
see http://dictionary.cambridge.org/researchers.htm
	- Oxford University Press has various US English pronunciation
	  resources with over 165,000 headwords and over 255,000 wordforms
see http://www.oup.co.uk/digital_reference
	- PRONLEX, a pronunciation lexicon developed as part of the
	  CALLHOME project, with 90,988 lexical entries
see http://www.ldc.upenn.edu/Catalog/LDC97L20.html

3. Regarding my mystery lexica:
	- Moby is, well, Moby (see #1)
	- MIT was developed from the 1964 Webster's Pocket dictionary.
	  There is a description in:

Shipman, D.W. and Zue, V.W. (1982) "Properties of Large Lexicons:
Implications for Advanced Isolated Word Recognition Systems", Conference
Record, IEEE International Conference on Acoustics, Speech, and Signal
Processing, Paris, France, 546-549.

Thanks again to everyone who replied,

jim

--------------------------------------------------------
James Magnuson
Department of Psychology
Columbia University
1190 Amsterdam Ave., MC 5501
New York City, New York  10027
(212)854-5667
magnuson at psych.columbia.edu


