IPA converter

Chris Waigl chris at LASCRIBE.NET
Thu Mar 12 23:53:55 UTC 2009

On 12 Mar 2009, at 04:45, Herb Stahlke wrote:
> Hopelessly unreliable.  I tried it on a number of words and phrases.
> Some it doesn't convert at all, some it converts in almost a random
> way, like orthographic <h> replaced by [Q].  It didn't recognize the
> SIL IPA 93 font in my font library, even though that's the font it
> asks for.  It couldn't transcribe "caught."  It comes up with
> unexplained symbols like a double >.  It does, however, convert
> "little" with a syllabic /l/.

I've been putting together an embryonic IPA converter, which currently
lives here: http://ipalizer.appspot.com/ .

Right now, all it does is transform Merriam-Webster-style phonetics
into IPA. So in order to use it, you need to:

* Access the MW page for a word (say: http://www.merriam-webster.com/dictionary/friday)
* Copy the phonetic transcription into your clipboard
* Paste it into the tool's text field and hit submit
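
To make the conversion step concrete, here is a minimal sketch of a table-driven transformation from pasted MW-style text to IPA. It is not the code behind the tool: the function name mw_to_ipa and the mapping table are hypothetical, and the table covers only a handful of symbols chosen for illustration. Longer keys are matched first so that digraphs such as "ch" are handled before single letters.

```python
import unicodedata

# Illustrative and deliberately incomplete mapping (an assumption, not the
# tool's real table). Pasted text has lost any underlining, so "th" can only
# be mapped to one sound here.
_RAW_TABLE = {
    "ch": "tʃ", "sh": "ʃ", "th": "θ",
    "ā": "eɪ", "ē": "i", "ī": "aɪ", "ō": "oʊ", "ü": "u",
    "a": "æ", "ä": "ɑ", "e": "ɛ", "i": "ɪ",
    "j": "dʒ", "y": "j",
    "-": ".",  # MW syllable break rendered as an IPA syllable dot
}
# Normalize keys so precomposed and combining forms compare equal.
MW_TO_IPA = {unicodedata.normalize("NFC", k): v for k, v in _RAW_TABLE.items()}
_KEYS = sorted(MW_TO_IPA, key=len, reverse=True)


def mw_to_ipa(text):
    """Convert a pasted MW-style transcription to IPA, symbol by symbol."""
    # Drop the surrounding backslashes MW puts around a transcription.
    text = unicodedata.normalize("NFC", text).strip("\\ ")
    out = []
    i = 0
    while i < len(text):
        for key in _KEYS:
            if text.startswith(key, i):
                out.append(MW_TO_IPA[key])
                i += len(key)
                break
        else:
            # Unknown symbols (stress marks, plain consonants) pass through.
            out.append(text[i])
            i += 1
    return "".join(out)
```

With this partial table, the pasted transcription for "Friday" comes out with the diphthong and the syllable break converted, while the stress mark and the plain consonants are left untouched.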

There are two major limitations:

1. It can't distinguish between [T] (as in "thief") and [D] (as in
"this"). The reason is that whoever spec'ed the MW representation of
phonetic characters had the extraordinarily bright idea of using the
<u> element in the HTML markup to realize underlining, which
distinguishes the two phonemes in their version of phonetic
transcription. Markup-level underlining does not copy and paste.
2. This is not really at a publishable level of completion. Way
pre-alpha. While I'd be delighted to get any feedback, please get in
touch if you want to use it for pretty much anything beyond playing around.
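
The first limitation goes away if the tool reads the HTML rather than pasted text, because the <u> element is still there. The sketch below assumes the underlined digraph appears as <u>th</u> in the markup; that exact form, and the names PronFlattener and flatten_pron, are assumptions for illustration only.

```python
from html.parser import HTMLParser


class PronFlattener(HTMLParser):
    """Flatten a pronunciation fragment, keeping the <u> distinction."""

    def __init__(self):
        super().__init__()
        self.in_u = False
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "u":
            self.in_u = True

    def handle_endtag(self, tag):
        if tag == "u":
            self.in_u = False

    def handle_data(self, data):
        # Assumption: an underlined "th" stands for the voiced fricative.
        if self.in_u and data == "th":
            self.parts.append("ð")
        else:
            self.parts.append(data)


def flatten_pron(html_fragment):
    parser = PronFlattener()
    parser.feed(html_fragment)
    parser.close()  # flush any trailing text still buffered
    return "".join(parser.parts)
```

An underlined "th" is emitted as "ð" straight away, while a plain "th" is left for the later symbol mapping to turn into the voiceless sound.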

The next thing I want to do is to replace the input field with a field
asking for the word to transcribe, then retrieve the MW page myself,
scrape out the phonetics, and then transpose those to IPA. Also, the
same could be done for AHD4 (as per bartleby.com), but they use even
more markup, which complicates matters.
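
A rough sketch of that retrieval step follows, using only the Python standard library. The URL pattern is the one from the example above, and the "pron" class on a <dd> element is the one mentioned at the end of this message; everything else (the names PronScraper, extract_pron and fetch_pron, and the assumption that the transcription is the text content of that element) is hypothetical, and the real page may be structured differently.

```python
from html.parser import HTMLParser
from urllib.parse import quote
from urllib.request import urlopen


class PronScraper(HTMLParser):
    """Collect the text inside every <dd class="pron"> element."""

    def __init__(self):
        super().__init__()
        self.capturing = False
        self.found = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "dd" and "pron" in classes:
            self.capturing = True
            self.found.append("")

    def handle_endtag(self, tag):
        if tag == "dd":
            self.capturing = False

    def handle_data(self, data):
        if self.capturing:
            self.found[-1] += data


def extract_pron(page_html):
    scraper = PronScraper()
    scraper.feed(page_html)
    scraper.close()
    return [text.strip() for text in scraper.found if text.strip()]


def fetch_pron(word):
    # Network access; the site's markup may not match the assumptions above.
    url = "http://www.merriam-webster.com/dictionary/" + quote(word)
    with urlopen(url) as response:
        page_html = response.read().decode("utf-8", errors="replace")
    return extract_pron(page_html)
```

Keeping extract_pron separate from fetch_pron means the parsing can be tried on a saved copy of a page without hitting the site each time.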

Chris Waigl
who is still very unhappy with the state of phonetic transcription in
English online dictionaries (the OED *still* uses small images for
some characters! do they need someone helping them out with Unicode?),
and amused about MW's choice of class names: <dd class="pron"><span

Chris Waigl -- http://chryss.eu -- http://eggcorns.lascribe.net
twitter: chrys -- friendfeed: chryss

The American Dialect Society - http://www.americandialect.org
