<html><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; ">Hi Madiha,<div><br></div><div>The FreeTTS project has a Java class, LetterToSoundImpl, that generates pronunciations for OOV words in the CMU format. It uses a simple state machine driven by a data file. The class is easy to run standalone, outside the FreeTTS project, as long as you have the data file. The logic is based on a paper that is referenced in the class's Javadoc; I'm not sure of the details.</div><div><br></div><div>I'm using it for OOV words that fall outside the standard CMU pronunciation dictionary, and it works fairly well.</div><div><br></div><div><a href="http://freetts.sourceforge.net/javadoc/com/sun/speech/freetts/lexicon/LetterToSoundImpl.html">http://freetts.sourceforge.net/javadoc/com/sun/speech/freetts/lexicon/LetterToSoundImpl.html</a></div><div><br></div><div>Hope this helps.</div><div><br></div><div>d</div><div><div><html>On 24-Apr-08, at 10:56 PM, Madiha Ijaz wrote:</html><br class="Apple-interchange-newline"><blockquote type="cite"><div>Dear all,</div> <div> </div> <div><div style="background-image: initial; background-repeat: initial; background-attachment: initial; -webkit-background-clip: initial; -webkit-background-origin: initial; background-color: white; margin-top: 0in; margin-right: 0in; margin-bottom: 0pt; margin-left: 0in; direction: ltr; text-align: left; background-position: initial initial; "><span style="FONT-SIZE: 9.5pt; FONT-FAMILY: Arial">A couple of days ago I posted a query about transcribing English text into Urdu and received some worthwhile suggestions in response.</span></div><div style="background-image: initial; background-repeat: initial; background-attachment: initial; -webkit-background-clip: initial; -webkit-background-origin: initial; background-color: white; margin-top: 0in; margin-right: 0in; margin-bottom: 0pt; margin-left: 0in; direction: ltr; text-align: left; background-position: initial initial; "><span style="FONT-SIZE: 
9.5pt; FONT-FAMILY: Arial">The approach I am working on right now uses the CMU pronunciation dictionary, and it is working fine, but OOV words remain a problem. One possible solution is to train neural nets or HMMs on the CMU pronunciation dictionary, which could then be used to predict the pronunciation of <span> </span>unknown words. So </span><span style="FONT-SIZE: 9.5pt; FONT-FAMILY: Arial">I wanted to know whether any related work has been done in this regard.</span></div><p style="BACKGROUND: white; MARGIN: 0in 0in 0pt; DIRECTION: ltr; TEXT-ALIGN: left"><span style="FONT-SIZE: 9.5pt; FONT-FAMILY: Arial"></span> </p><div style="background-image: initial; background-repeat: initial; background-attachment: initial; -webkit-background-clip: initial; -webkit-background-origin: initial; background-color: white; margin-top: 0in; margin-right: 0in; margin-bottom: 0pt; margin-left: 0in; direction: ltr; text-align: left; background-position: initial initial; "><span style="FONT-SIZE: 9.5pt; FONT-FAMILY: Arial">Secondly, is there an English pronunciation dictionary that provides syllabified transcriptions rather than plain transcriptions, or a tool that syllabifies English text?</span></div><p style="BACKGROUND: white; MARGIN: 0in 0in 0pt; DIRECTION: ltr; TEXT-ALIGN: left"><span style="FONT-SIZE: 9.5pt; FONT-FAMILY: Arial"></span> </p><div style="background-image: initial; background-repeat: initial; background-attachment: initial; -webkit-background-clip: initial; -webkit-background-origin: initial; background-color: white; margin-top: 0in; margin-right: 0in; margin-bottom: 0pt; margin-left: 0in; direction: ltr; text-align: left; background-position: initial initial; "><span style="FONT-SIZE: 9.5pt; FONT-FAMILY: Arial">Regards</span></div><div style="background-image: initial; background-repeat: initial; background-attachment: initial; -webkit-background-clip: initial; -webkit-background-origin: initial; background-color: white; margin-top: 0in; 
margin-right: 0in; margin-bottom: 0pt; margin-left: 0in; direction: ltr; text-align: left; background-position: initial initial; "><span style="FONT-SIZE: 9.5pt; FONT-FAMILY: Arial">Madiha</span></div></div> <div> </div> _______________________________________________<br>Corpora mailing list<br><a href="mailto:Corpora@uib.no">Corpora@uib.no</a><br>http://mailman.uib.no/listinfo/corpora<br></blockquote></div><br><div apple-content-edited="true"> <span class="Apple-style-span" style="border-collapse: separate; border-spacing: 0px 0px; color: rgb(0, 0, 0); font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; text-align: auto; -khtml-text-decorations-in-effect: none; text-indent: 0px; -apple-text-size-adjust: auto; text-transform: none; orphans: 2; white-space: normal; widows: 2; word-spacing: 0px; "><div style="word-wrap: break-word; -khtml-nbsp-mode: space; -khtml-line-break: after-white-space; "><div>David Ayre</div><div><a href="mailto:dave@ayre.ca">dave@ayre.ca</a></div><div><a href="http://www.gtrlabs.org">http://www.gtrlabs.org</a></div><div><a href="http://www.linguity.com">http://www.linguity.com</a></div><div><br class="khtml-block-placeholder"></div></div><br class="Apple-interchange-newline"></span> </div><br></div></body></html>