[Corpora-List] transcribe English text

Madiha Ijaz madiha.ijaz at nu.edu.pk
Thu Mar 6 11:51:02 UTC 2008


Hi Briony,

Well, I am not looking for any particular English accent. My purpose is to
get a phonetic transcription of English text and then convert that phonetic
transcription to Urdu text.

Basically, I want to transliterate English text (misspelled words too) into
Urdu, and the existing algorithms do not give good results when converting
English text to Urdu text directly. If I can get transcribed English text,
then at least the letter-to-sound rules and sound-change rules of English
are taken care of, and when the phonetic transcription is afterwards
converted to Urdu text, the accuracy will be much higher.

This is to be used in an Urdu text-to-speech system: while processing Urdu
text, one comes across a lot of English text, which needs to be
transliterated into Urdu in order to be processed by the TTS.

I hope this answers your question.
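
To make the second step concrete, here is a minimal sketch of converting a
phonetic transcription to Urdu letters. It assumes the transcription arrives
as CMU-style phoneme codes; the table PHONE_TO_URDU and the function
phones_to_urdu are hypothetical names, and the mapping is a tiny illustrative
subset (short vowels are simply left unwritten), not a complete or validated
rule set.

```python
# Hypothetical phoneme-to-Urdu table: a small illustrative subset only.
# Keys are CMU-style phoneme codes; values are Urdu letters.
PHONE_TO_URDU = {
    "B": "\u0628",   # ب
    "P": "\u067e",   # پ
    "T": "\u0679",   # ٹ (assumption: English t written with the retroflex letter)
    "K": "\u06a9",   # ک
    "S": "\u0633",   # س
    "M": "\u0645",   # م
    "N": "\u0646",   # ن
    "L": "\u0644",   # ل
    "R": "\u0631",   # ر
    "IY": "\u06cc",  # ی
    "UW": "\u0648",  # و
    "AA": "\u0627",  # ا
    "AH": "",        # short vowel, left unwritten in this sketch
}


def phones_to_urdu(phones):
    """Map a list of phoneme codes (stress digits allowed) to an Urdu string."""
    letters = []
    for phone in phones:
        base = phone.rstrip("012")  # drop the stress digit, e.g. "IY1" -> "IY"
        if base not in PHONE_TO_URDU:
            raise KeyError(f"no Urdu mapping for phoneme {phone!r}")
        letters.append(PHONE_TO_URDU[base])
    return "".join(letters)


# Transliterate the phonemes of "team" (T IY1 M).
print(phones_to_urdu("T IY1 M".split()))
```

A real system would need context-dependent rules (word-initial vowels,
diphthongs, consonant clusters), which this table does not attempt.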

Madiha


On 3/6/08, Briony Williams <b.williams at bangor.ac.uk> wrote:
>
> Alexander S. Yeh wrote:
> > Madiha Ijaz wrote:
> >> i wanted to know if there is any open source tool available that can
> >> transcribe English text in either IPA or SAMPA.
> >
> > Some indirect paths:
> >
> > 1. The CMU pronouncing dictionary maps many English words to one or more
> > pronunciations. The pronunciations are coded with combinations of ASCII
> > characters. Each combination maps to one (sometimes more) IPA symbols.
>
> Are you looking for a US or a UK pronunciation of English?
>
> For a US accent, the CMU dictionary (as Alexander wrote) is a good tool.
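
As an illustration of the "ASCII combinations map to IPA symbols" point, the
sketch below converts a CMU-style pronunciation to IPA. The table
ARPABET_TO_IPA is a hand-picked partial subset and the function name
arpabet_to_ipa is hypothetical; a full table would cover the complete
phoneme set of the dictionary.

```python
# Partial table from CMU-style phoneme codes to IPA (illustrative subset).
ARPABET_TO_IPA = {
    "AE": "æ", "AH": "ʌ", "EH": "ɛ", "IY": "i", "OW": "oʊ",
    "HH": "h", "K": "k", "L": "l", "M": "m", "N": "n",
    "P": "p", "S": "s", "T": "t",
}


def arpabet_to_ipa(phones):
    """Convert a list of phoneme codes, with optional stress digits, to IPA."""
    out = []
    for phone in phones:
        if phone == "AH0":
            # Unstressed AH is conventionally transcribed as schwa.
            out.append("ə")
            continue
        base = phone.rstrip("012")  # drop the stress digit, e.g. "AE1" -> "AE"
        # Unknown codes are passed through in brackets rather than guessed.
        out.append(ARPABET_TO_IPA.get(base, f"[{base}]"))
    return "".join(out)


print(arpabet_to_ipa("K AE1 T".split()))
```

Note that one code can yield more than one IPA symbol (OW becomes a
two-symbol diphthong here), which matches the "one (sometimes more)" remark
above.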
>
> > 2. The Festival text-to-speech system uses the CMU pronouncing
> > dictionary and other resources to make a best guess at how a word is
> > pronounced. Initially, the pronunciation is encoded in the same way as
> > in the CMU pronouncing dictionary.
> > Some people have gotten Festival to output these pronunciation codes.
>
> This is quite easy to do within Festival. Or just grep the dictionary,
> perhaps?
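
The "grep the dictionary" route could look roughly like the following sketch.
It assumes the plain-text layout of the CMU dictionary (a headword followed
by whitespace-separated phoneme codes, comment lines starting with ";;;",
alternative pronunciations marked with a parenthesised number). The function
name lookup and the SAMPLE lines are made up for illustration; in practice
the lines would be read from the downloaded dictionary file.

```python
import re


def lookup(word, dict_lines):
    """Return every pronunciation listed for `word` as a list of phoneme lists."""
    target = word.upper()
    results = []
    for line in dict_lines:
        if not line.strip() or line.startswith(";;;"):
            continue  # skip blank lines and comments
        head, *phones = line.split()
        # Strip a trailing variant marker such as "(2)" from the headword.
        entry = re.sub(r"\(\d+\)$", "", head)
        if entry.upper() == target:
            results.append(phones)
    return results


# Made-up sample lines in the dictionary's style (not copied from the real file).
SAMPLE = """\
;;; sample entries
CAT  K AE1 T
READ  R EH1 D
READ(2)  R IY1 D
""".splitlines()

for pron in lookup("read", SAMPLE):
    print(" ".join(pron))
```

Words missing from the dictionary (including misspellings) return an empty
list, which is where a letter-to-sound fallback such as Festival's would be
needed.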
>
> On the other hand, if a UK English pronunciation is preferred, you could
> use the OALD which is part of the Festival distribution (bear in mind the
> licence restricts its use to non-commercial purposes only). Or
> alternatively use the pronunciation lookup feature within SFS (Speech
> Filing System), available at
> http://www.phon.ucl.ac.uk/resource/sfs/
>
> I hope this helps.
>
> Best regards
>
> Briony Williams
>
> --
> Briony Williams
>
> Arweinydd Tîm Technoleg Lleferydd / Speech Technology Team Leader
> Uned Technolegau Iaith            / Language Technologies Unit
> Adeilad Rhos, Safle'r Normal      / Rhos Building, Normal Site
> Prifysgol Bangor                  / Bangor University
> Bangor                            / Bangor
> Gwynedd LL57 2PX, UK              / Gwynedd LL57 2PX, UK
>
> E-Bost / E-Mail : b.williams at bangor.ac.uk
> Gwe (Cymraeg)   : http://www.bangor.ac.uk/ar/cb/technolegau_iaith.php.cy
> Web (English)   : http://www.bangor.ac.uk/ar/cb/technolegau_iaith.php.en
> Ffôn / Tel      : +44 (0) 1506 200862
> Rhithfro / Blog : http://murmur.bangor.ac.uk
> ....................................................................
>
> _______________________________________________
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora
>