22.2903, Qs: Initial Training for Speech Recognition Software

Fri Jul 15 13:57:13 UTC 2011

LINGUIST List: Vol-22-2903. Fri Jul 15 2011. ISSN: 1068 - 4875.

Subject: 22.2903, Qs: Initial Training for Speech Recognition Software

Moderators: Anthony Aristar, Eastern Michigan U <aristar at linguistlist.org>
            Helen Aristar-Dry, Eastern Michigan U <hdry at linguistlist.org>

Reviews: Veronika Drake, U of Wisconsin-Madison
         Monica Macaulay, U of Wisconsin-Madison
         Rajiv Rao, U of Wisconsin-Madison
         Joseph Salmons, U of Wisconsin-Madison
         Anja Wanner, U of Wisconsin-Madison
         <reviews at linguistlist.org>

Homepage: http://linguistlist.org/

The LINGUIST List is funded by Eastern Michigan University, 
and donations from subscribers and publishers.

Editor for this issue: Brent Woo <bwoo at linguistlist.org>
================================================================  

We'd like to remind readers that the responses to queries are usually
best posted to the individual asking the question. That individual is
then strongly encouraged to post a summary to the list. This policy was
instituted to help control the huge volume of mail on LINGUIST; so we
would appreciate your cooperating with it whenever it seems appropriate.

In addition to posting a summary, we'd like to remind people that it
is usually a good idea to personally thank those individuals who have
taken the trouble to respond to the query.

To post to LINGUIST, use our convenient web form at
http://linguistlist.org/LL/posttolinguist.cfm.

===========================Directory==============================  

1)
Date: 13-Jul-2011
From: Anna Haberko [ahaberko at gmail.com]
Subject: Initial Training for Speech Recognition Software

-------------------------Message 1 ---------------------------------- 
Date: Fri, 15 Jul 2011 09:56:12
From: Anna Haberko [ahaberko at gmail.com]
Subject: Initial Training for Speech Recognition Software

My company is developing software for doctors to dictate reports. Our
software relies on a speech recognition engine that is trained to 
recognize words.  To improve on the current model, I am redesigning 
the initial speech training component.  As I would like to develop 
effective material, I am looking for insight on the following questions:

What are the requirements for an initial speech training text (the text
a user reads aloud to train the speech engine, so that they can start
working with a satisfactory level of recognition)?
Does it have to include all possible phonemes of the language?
Do the phonemes have to be repeated a certain number of times?
If the full phonemic inventory is not required, what would be necessary
for a language such as English?
What other requirements should I consider for such a text?
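
To make the coverage questions above concrete, here is a minimal sketch
of how one might count phoneme occurrences in a candidate training text.
It assumes a hypothetical toy pronunciation lexicon (TOY_LEXICON) with
ARPAbet-style symbols and a hypothetical minimum repetition count; a real
check would need a full pronunciation dictionary, and nothing here
reflects how any particular engine selects its training text.

```python
from collections import Counter
import re

# Hypothetical toy lexicon: word -> ARPAbet-style phonemes (illustrative only).
TOY_LEXICON = {
    "the": ["DH", "AH"],
    "patient": ["P", "EY", "SH", "AH", "N", "T"],
    "has": ["HH", "AE", "Z"],
    "a": ["AH"],
    "fever": ["F", "IY", "V", "ER"],
}


def phoneme_counts(text, lexicon):
    """Count phoneme occurrences; also return words missing from the lexicon."""
    counts = Counter()
    unknown = []
    for word in re.findall(r"[a-z']+", text.lower()):
        phones = lexicon.get(word)
        if phones is None:
            unknown.append(word)  # cannot be scored without a pronunciation
        else:
            counts.update(phones)
    return counts, unknown


def under_covered(counts, inventory, min_count):
    """Return the phonemes of the inventory seen fewer than min_count times."""
    return sorted(p for p in inventory if counts[p] < min_count)


if __name__ == "__main__":
    # Assumed partial inventory and threshold, for illustration only.
    inventory = {"DH", "AH", "P", "EY", "SH", "N", "T", "K"}
    counts, unknown = phoneme_counts("The patient has a fever today.", TOY_LEXICON)
    print("counts:", counts.most_common())
    print("words not in lexicon:", unknown)
    print("under-covered:", under_covered(counts, inventory, 2))
```

A report like this would only show which phonemes a draft text covers
and how often; what threshold is actually sufficient is exactly what I
am asking about.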

I have done some research on this subject, but I have had trouble
finding adequate guidelines, and I have not been able to search speech
corpora for texts of this kind. I have a sample text from the
SpeechMagic software (provided by Nuance), but I would be grateful for
any additional examples people could provide. Any other resources or
guidelines for speech recognition development would also be greatly
appreciated.

Linguistic Field(s): Computational Linguistics
