[Corpora-List] International Phonetic Alphabet transcription tool / software

WHITELOCK, Pete pete.whitelock at oup.com
Wed Mar 6 15:28:29 UTC 2013


Isn't everyone missing the point here? It's not about ASR; Mario asked about automatic analysis of spectrograms, a problem in visual pattern extraction.

Pete Whitelock, PhD
Principal Language Engineer, Technology
Academic Dictionaries
Oxford University Press

From: corpora-bounces at uib.no On Behalf Of Jim Fidelholtz
Sent: 06 March 2013 15:18
To: Matías Guzmán
Cc: Mario Crespo Miguel; corpora-request at uib.no; corpora at uib.no
Subject: Re: [Corpora-List] International Phonetic Alphabet transcription tool / software

Hi, Mario,

My advice is to take Matías' comments very seriously. Over 40 years ago, I was involved with the US government group which funded most American research in speech recognition. Many very bright researchers were involved in this research. Unfortunately, despite the fact that even then (about 1970 and just earlier) there were 'effective' systems trainable on small vocabularies (e.g., numbers 0-9 for the Post Office) and individual speakers, the field was populated by many charlatans/snake-oil salesman types, and believe me, I use those terms advisedly. Currently, although I am very aware of the multiple orders of magnitude increases in computer speed, capacity and general gee-whiz factor, I am still not convinced of the general applicability of such systems, and of course even less convinced of any linguistic interest in such systems, even for general phone answering, etc., and of course (unfortunately) much less for applications like the one you are asking about.

The usual disclaimers apply, and you are unlikely to encounter anyone quite as negative as I am about speech recognition, and I'm sure the field still contains many brilliant, ethical researchers, and I have not been following publications in the field in recent decades. Nevertheless, I stand by my opinions, based on well-informed experience, even though it was decades ago. Example: my current experience with automatic speech-recognition systems is that, more often than not, I end up being transferred to a human operator, and I speak Midwestern American English, which should be relatively easy for these systems to recognize, from a linguistically-informed standpoint. Sorry to be so negative, but I'd bet lots of money that you won't be successful in finding anything which would be very satisfying, even if you have program-tweaking powers. I would say that if you have lowered expectations and an extremely liberal attitude, you *might* find something which could give you some minimum help in your research. Perhaps it would be better to be (or use/hire) an experienced dialectologist, design very intelligent experiments which could get useful results from minimal recordings, and do your own transcriptions. I know that's a tall order. Good luck.

Jim

On Wed, Mar 6, 2013 at 8:45 AM, Matías Guzmán <mortem.dei at gmail.com> wrote:
Mario, I doubt there is anything that can do what you want. As I understand it, speech recognition systems depend to a great degree on a language model that predicts what word could come next, and then try to match what the speaker said to a database. I don't see how a program could tell you whether the /t/ is dental or alveolar. I don't even think speech recognition software does a decent job working on single syllables. But I'm not an expert; maybe someone can correct me.

Regards, Matías

2013/3/6 Mario Crespo Miguel <mario.crespo at uca.es>

Dear members of corporalist,

In the research of phonetic dialectology it is extremely important to be able to differentiate between the different sounds pronounced by the subjects being studied and be able to transcribe them according to linguistic criteria. This task can be extremely exhausting and subjective for the researcher when carried out without any help.

I wonder if you know of a tool or software that can display and analyze spectrograms of recorded speech sounds and transcribe those sounds into the International Phonetic Alphabet (IPA).

Thank you very much in advance,

Mario Crespo

_______________________________________________
UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora




--
James L. Fidelholtz
Posgrado en Ciencias del Lenguaje
Instituto de Ciencias Sociales y Humanidades
Benemérita Universidad Autónoma de Puebla, MÉXICO
