[Corpora-List] transcribe English text

Dr DJ Hatch drdjhatch at gmail.com
Mon Mar 10 15:55:06 UTC 2008


But, of course, the UK/US differences are only a part of the story. There
are differences between the South East and the North of England, and South
East and the South West (and Wales?). These differences are particularly
noticeable in words such as grass and bus (South East/North). This without even
mentioning Australia, ...

However, one thing that still seems to be believed is that the Brits and US
citizens differ in their pronunciation of potato. The moral, obviously, is
to stop listening to Frank Sinatra.

On 10/3/08 11:39, "Briony Williams" <b.williams at bangor.ac.uk> wrote:

> Madiha Ijaz wrote:
>> Well, I am not looking for any particular English accent, as my purpose is
>> to get a phonetic transcription of English text and then convert that
>> phonetic transcription to Urdu text.
> 
> Some of the vowels of UK English differ from US English vowels, and so to
> "get a phonetic transcription of English text", you will need to know which
> particular phonemic system is being used.  I have no idea about the different
> mappings of UK/US English to Urdu, but it's very possible that these major
> vowel differences will affect that mapping.  So you will in fact need to take
> into consideration the specific variety of English that is used.
> 
>> Basically, I want to transliterate English text (misspelled words too) into
>> Urdu, and existing algorithms do not give good results when converting
>> English text to Urdu text directly. If I can get transcribed English text,
>> then at least the letter-to-sound rules and sound change rules of English
>> are taken care of, and when the phonetic transcription is afterwards
>> converted to Urdu text, the accuracy will be much higher.
> 
> As Kirk Baker wrote:
> 
>> In that case it sounds like a lot of words won't be in the dictionary,
>> which is probably why people are suggesting something like festival. It
>> generates a pronunciation for words it doesn't know.
>> 
>> One way to get the pronunciation from festival is:
>> 
>> echo "(utt.relation.print (utt.synth (Utterance Text \"Pronounce this.\")) \"Segment\")" | festival
>> 
>> which can go in a script to process the English text.
> 
> This will give a lot of other information as well as the phoneme symbols. If
> you need just the phoneme symbols (and POS info), then the "lex.lookup"
> command within Festival is perhaps better.
> 
>> This is to be used in an Urdu text-to-speech system: while processing Urdu
>> text, one comes across a lot of English text which needs to be transliterated
>> into Urdu in order to be processed by the TTS.
>> I hope this clarifies your question.
> 
> But if it's *mis-spelled* English then how will you know that it's English?
> 
> Best wishes for the success of your Urdu TTS system.
> 
> Best regards
> 
> Briony Williams
> 
> 
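
Kirk Baker's one-liner quoted above could be wrapped in a small script along
these lines. This is only a sketch: the helper names festival_expr and
transcribe_lines are made up for illustration, and it assumes festival is on
the PATH and reads Scheme from standard input, as in the quoted command. It
starts one festival process per input line, which is simple but slow.

```shell
#!/bin/sh
# Sketch only: festival_expr and transcribe_lines are hypothetical helper
# names, not part of Festival.

# Build the Scheme expression from the thread for one line of English text.
festival_expr() {
    # Escape backslashes and double quotes so the text stays a single
    # Scheme string literal.
    escaped=$(printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')
    printf '(utt.relation.print (utt.synth (Utterance Text "%s")) "Segment")\n' "$escaped"
}

# Read English text from stdin, one utterance per line, and pipe the
# expression for each non-empty line into festival (assumed to be installed).
transcribe_lines() {
    while IFS= read -r line; do
        [ -n "$line" ] || continue
        festival_expr "$line" | festival
    done
}
```

Usage would be something like sourcing the file and running
"transcribe_lines < english.txt" (the file name is hypothetical). The Segment
output still carries more than the phoneme symbols, so it would need filtering
afterwards.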



_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora


