[Corpora-List] Automatic IPA transcription
Christian Pietsch
chr.pietsch at googlemail.com
Wed Jun 20 19:17:57 UTC 2012
Hi Sam,
I assume you have English text, not speech. Then what you need is a
grapheme-to-phoneme (G2P) converter. You will find them as components
of text-to-speech (TTS) systems. For English text, you could use
eSpeak or Festival, both of which are easily obtainable, e.g. as
Debian or Ubuntu Linux packages. Here is something I tried:
$ echo 'Will you pronounce this correctly?' | espeak -v en -x -q
--> wIl ju: pr@n'aUns DIs k@r'Ektli
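The output you can see here is what eSpeak calls “phoneme mnemonics”.
I guess it is close to X-SAMPA, which is an ASCII representation of IPA
(the eSpeak documentation describes the mnemonics as based on
Kirshenbaum's ASCII-IPA scheme). For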
a mapping table and code in several programming languages, including
Python, see Henrik Theiling's IPA site <http://www.theiling.de/ipa/>.
Using his cxs.py module and CXS.def lookup table, I get this result:
--> wɪl juː prənˈaʊns ðɪs kərˈɛktli
Looks OK to me.
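Since the goal was something that plugs into a Python program, here is a
minimal sketch of the two steps above. It assumes Python 3.7 or later and
an `espeak` binary on the PATH. The function names and the small lookup
table are my own illustration, not part of eSpeak or of cxs.py, and the
table covers only the symbols that occur in this one example ('@' is the
mnemonic for schwa), so a real program should use a complete mapping.

```python
import subprocess

# Hypothetical, deliberately tiny table: only the mnemonics seen in the
# example sentence. Anything not listed is passed through unchanged.
_MNEMONIC_TO_IPA = str.maketrans({
    "I": "ɪ",   # near-close front vowel
    "U": "ʊ",   # near-close back vowel
    "@": "ə",   # schwa
    "E": "ɛ",   # open-mid front vowel
    "D": "ð",   # voiced dental fricative
    ":": "ː",   # length mark
    "'": "ˈ",   # primary stress
})


def espeak_mnemonics(text, voice="en"):
    """Run `espeak -v VOICE -x -q`, feeding the text on stdin."""
    result = subprocess.run(
        ["espeak", "-v", voice, "-x", "-q"],
        input=text,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def mnemonics_to_ipa(mnemonics):
    """Map the ASCII mnemonics to IPA, character by character."""
    return mnemonics.translate(_MNEMONIC_TO_IPA)


# Converting the mnemonic string from the example (no espeak needed here):
print(mnemonics_to_ipa("wIl ju: pr@n'aUns DIs k@r'Ektli"))
# prints: wɪl juː prənˈaʊns ðɪs kərˈɛktli
```

To go from text to IPA in one step, you would call
mnemonics_to_ipa(espeak_mnemonics('Will you pronounce this correctly?')).
A character-by-character table is the simplest thing that works for this
sentence; mnemonics longer than one character would need a
longest-match-first lookup instead.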
Instead of using parts of a full TTS system, you can also use
stand-alone G2P tools such as Sequitur G2P or Phonetisaurus, but you
might have to train them first.
Hope this helps,
Christian
On Tue, Jun 19, 2012 at 02:23:30PM -0400, Sam Raker wrote:
> I was wondering if anyone has found a good (OSX/*NIX-compatible)
> program for automatic transcription (of English) to IPA. There are a
> few websites that offer to do it, but I'd prefer something I could
> plug in to a python program, if possible.
--
Christian Pietsch
http://purl.org/net/pietsch
Bielefeld University, Bielefeld, Germany
University Library and CRC 882