6.1645, Sum: Machine Readable Dictionaries

Wed Nov 22 16:14:36 UTC 1995

---------------------------------------------------------------------------
LINGUIST List:  Vol-6-1645. Wed Nov 22 1995. ISSN: 1068-4875. Lines:  84

Subject: 6.1645, Sum: Machine Readable Dictionaries

Moderators: Anthony Rodrigues Aristar: Texas A&M U. <aristar at tam2000.tamu.edu>
            Helen Dry: Eastern Michigan U. <hdry at emunix.emich.edu>

Associate Editor:  Ljuba Veselinova <lveselin at emunix.emich.edu>
Assistant Editors: Ron Reck <rreck at emunix.emich.edu>
                   Ann Dizdar <dizdar at tam2000.tamu.edu>
                   Annemarie Valdez <avaldez at emunix.emich.edu>

Software development: John H. Remmers <remmers at emunix.emich.edu>

Editor for this issue: dseely at emunix.emich.edu (T. Daniel Seely)

---------------------------------Directory-----------------------------------
1)
Date:  Tue, 21 Nov 1995 18:50:34 EST
From:  PATERJV at QUCDN.QueensU.CA (Joe Pater)
Subject:       Machine Readable Dictionaries: Summary

---------------------------------Messages------------------------------------
1)
Date:  Tue, 21 Nov 1995 18:50:34 EST
From:  PATERJV at QUCDN.QueensU.CA (Joe Pater)
Subject:       Machine Readable Dictionaries: Summary

Thanks to all who answered my query about machine-readable
pronunciation dictionaries:
David Powers, Patrick Juola, Pete Whitelock, John Coleman,
Francois Yvon, Bruce Nevin, Richard Shillcock, Tony Vitale,
Jean-Louis Duchet, and Peter Daniels.

The answers contained many useful suggestions on paths I might
take to obtain copyright-free English pronunciation dictionaries.
I have yet to follow up on most of these, but I will post any further
results that might be of use.

By far the most frequently cited source was the Oxford Text Archive,
which contains copies of several machine-readable dictionaries.
The WWW URL is:  http://info.ox.ac.uk/~archive/ota.html
The Oxford Text Archive Shortlist, which gives brief, up-to-date
details of all texts held in the Archive, can be obtained by e-mail
to ARCHIVE at VAX.OXFORD.AC.UK, or by anonymous FTP
from the directory ota on ota.ox.ac.uk.

Other Sources:

Patrick Juola writes:
"Grady Ward has a MOBY Pronunciator list available. About
80,000 words of English with an approximate IPA pronunciation
listed.
Research licence is about US $200, I think the commercial is double
that.  Get it from grady at netcom.com"

Francois Yvon mentions:
"[o]ne [that] has been used in the NetTalk experiments and is
available. You can get more info on that corpus (including ftp
address) by mailing to:
neural-bench at cs.cmu.edu"

Another dictionary is available via ftp from:

host:      ftp.cs.cmu.edu [128.2.206.173]
directory: project/fgdata/dict
  Retrieve the following files:

     README
     cmudict.0.2.Z (compressed)
     cmulex.0.1.Z (compressed)
     phoneset.0.1
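
For anyone who would rather script the retrieval than type the ftp
commands by hand, here is a minimal sketch in Python using the standard
ftplib module. The host, directory, and file names are the ones listed
above; the function name fetch_files, the ftp_factory parameter, and the
use of an anonymous login are my own assumptions for illustration, and
the server may of course no longer carry these files.

```python
from ftplib import FTP
from pathlib import Path

# Host, directory, and file names as listed in the message above.
HOST = "ftp.cs.cmu.edu"
DIRECTORY = "project/fgdata/dict"
FILES = ["README", "cmudict.0.2.Z", "cmulex.0.1.Z", "phoneset.0.1"]


def fetch_files(dest_dir, host=HOST, directory=DIRECTORY, files=FILES,
                ftp_factory=FTP):
    """Download each listed file in binary mode; return the local paths.

    fetch_files and ftp_factory are hypothetical names for this sketch.
    ftp_factory exists only so that a stand-in class can replace
    ftplib.FTP when no network is available.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    ftp = ftp_factory(host)
    saved = []
    try:
        ftp.login()          # no arguments: anonymous login (an assumption)
        ftp.cwd(directory)
        for name in files:
            target = dest / name
            with open(target, "wb") as out:
                # Binary mode matters for the compressed .Z files.
                ftp.retrbinary(f"RETR {name}", out.write)
            saved.append(target)
    finally:
        ftp.close()
    return saved
```

Calling fetch_files("dict") would place the four files in a local
directory named dict. The .Z files are in Unix compress format, which
the Python standard library does not unpack, so they would still need
to be run through uncompress afterwards.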

Any other suggestions are still welcome.

Joe Pater,
paterjv at qucdn.queensu.ca

P.S. For information on CHILDES, go to http://poppy.psy.cmu.edu
or consult MacWhinney, B. (1995) The CHILDES Project (2nd ed.), LEA.
------------------------------------------------------------------------