[Corpora-List] transcribe English text

Briony Williams b.williams at bangor.ac.uk
Mon Mar 10 10:39:56 UTC 2008


Madiha Ijaz wrote:
> well i am not looking for any particular English accent as my purpose is to
> get phonetic transcription of English text and then convert that phonetic
> transcription to Urdu text.

The vowel systems of UK and US English differ, so to "get a phonetic
transcription of English text" you will need to know which particular
phonemic system is being used.  I do not know how UK or US English maps onto
Urdu, but these major vowel differences may well affect that mapping.  So you
will in fact need to take into account the specific variety of English that
is used.

> Basically i want to transliterate English text (misspelled words too) into
> Urdu and already existing algorithms do not give good
> results while converting English text to Urdu text directly. if i can get
> transcribed English text then at least letter-to-sound rules and sound
> change rules of English are taken care of and when the phonetic
> transcription is afterwards converted to Urdu text; the accuracy will be
> much higher.

As Kirk Baker wrote:

 > In that case it sounds like a lot of words won't be in the dictionary,
 > which is probably why people are suggesting something like festival. It
 > generates a pronunciation for words it doesn't know.
 >
 > One way to get the pronunciation from festival is:
 >
 > echo "(utt.relation.print (utt.synth (Utterance Text \"Pronounce this.\"))
 > \"Segment\")" | festival
 >
 > which can go in a script to process the English text.

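This will give a lot of other information as well as the phoneme symbols. If
you need just the phoneme symbols (and POS info), then the "lex.lookup"
command within Festival is perhaps better.  It returns the lexicon entry for a
word and normally falls back on the letter-to-sound rules when the word is not
in the lexicon.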
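
A minimal sketch of how a per-word lookup might go in a script, assuming the
Festival function is named lex.lookup and that an entry has the usual form
("word" pos (((phones) stress) ...)).  The helper names (lex_lookup_expr,
phones_from_entry, transcribe) are made up for illustration, and the parsing
assumes Festival prints each entry on a single line:

```python
import re
import subprocess


def lex_lookup_expr(word):
    """Return a Scheme expression that prints Festival's entry for `word`."""
    # Escape backslashes and double quotes so the word stays one Scheme string.
    escaped = word.replace("\\", "\\\\").replace('"', '\\"')
    return '(print (lex.lookup "%s" nil))' % escaped


def phones_from_entry(entry):
    """Pull the phone symbols out of an entry such as
    ("hello" nil (((hh ax) 0) ((l ow) 1)))   <- illustrative, not real output
    """
    # Each syllable looks like "((phones) stress)"; capture the phones only.
    syllables = re.findall(r"\(\(([^()]*)\)\s*\d+\)", entry)
    return [phone for syllable in syllables for phone in syllable.split()]


def transcribe(word):
    """Pipe the expression to festival on stdin, as in the one-liner above.

    Assumes a `festival` binary is on PATH; raises if it is missing or fails.
    """
    result = subprocess.run(
        ["festival"],
        input=lex_lookup_expr(word) + "\n",
        capture_output=True,
        text=True,
        check=True,
    )
    # Use the first output line that contains a recognisable entry.
    for line in result.stdout.splitlines():
        phones = phones_from_entry(line)
        if phones:
            return phones
    return []
```

Calling transcribe("transcribe") should then return a flat list of phone
symbols in whatever phone set the current Festival voice uses, which is where
the UK/US question above comes back in.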

> this is to be used in an Urdu text-to-speech system as while processing Urdu
> text one comes across a lot of English text which needs to be transliterated
> into Urdu in order to be processed by the TTS.
> i hope this clarifies your question.

But if it's *mis-spelled* English then how will you know that it's English?

Best wishes for the success of your Urdu TTS system.

Best regards

Briony Williams




Arweinydd Tîm Technoleg Lleferydd / Speech Technology Team Leader
Uned Technolegau Iaith            / Language Technologies Unit
Adeilad Rhos, Safle'r Normal      / Rhos Building, Normal Site
Prifysgol Bangor                  / Bangor University
Bangor                            / Bangor
Gwynedd LL57 2PX, UK              / Gwynedd LL57 2PX, UK

E-Bost / E-Mail : b.williams at bangor.ac.uk
Gwe (Cymraeg)   : http://www.bangor.ac.uk/ar/cb/technolegau_iaith.php.cy
Web (English)   : http://www.bangor.ac.uk/ar/cb/technolegau_iaith.php.en
Ffôn / Tel      : +44 (0) 1506 200862
Rhithfro / Blog : http://murmur.bangor.ac.uk