Arabic-L:LING:Automated transliteration method response

Dilworth Parkinson dilworth_parkinson at BYU.EDU
Thu Sep 28 23:08:17 UTC 2006


------------------------------------------------------------------------
Arabic-L: Thu 28 Aug 2006
Moderator: Dilworth Parkinson <dilworth_parkinson at byu.edu>
[To post messages to the list, send them to arabic-l at byu.edu]
[To unsubscribe, send message from same address you subscribed from to
listserv at byu.edu with first line reading:
            unsubscribe arabic-l                                      ]

-------------------------Directory------------------------------------

1) Subject:Automated transliteration method response
2) Subject:Automated transliteration method response

-------------------------Messages-----------------------------------
1)
Date: 28 Aug 2006
From:"Tim Buckwalter" <timbuckwalter at qamus.org>
Subject:Automated transliteration method response

Michael,

I assume you're attempting to convert Arabic script to something
resembling LC transliteration. There is a tool from Basis Technology
called Transliteration Assistant that does that for Arabic names:
http://www.basistech.com/transliteration-asst/

Tim Buckwalter
Philadelphia
www.qamus.org

------------------------------------------------------------------------
2)
Date: 28 Aug 2006
From:Ben Huyck <regexer at gmail.com>
Subject:Automated transliteration method response

Greetings Dr. Toler,

The problem you've presented here is interesting because it requires
the combination of what have historically been two separate technology
areas: transliteration and vocalization (or diacritization). I assume
that the book titles are already in electronic format (UTF-8 or
similar), and that they are not fully vocalized.

Thus, I will assume that your goal is to automatically do the
following conversion:

بين القصرين --> Bayn al-Qasrayn (or some similarly readable
transliteration)

To start, the best tool of which I am aware for transliteration is
Basis Technology's Name Translator
(http://www.basistech.com/name-translator/). The problem is that it
only fully transliterates names, probably using some form of
dictionary lookup. Any word/name that is not recognized by the tool is
transliterated just as it was encountered--vocalized or not. This
means that نجيب محفوظ would (probably) be successfully converted to
'Naguib Mahfouz', but the title بين القصرين would be rendered
'byn AlqSryn' (using Buckwalter's transliteration scheme). Any
one-to-one transliteration scheme has the benefit of being reversible
to the original Arabic, but it is hardly readable by a human.
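
A minimal sketch of what "one-to-one and reversible" means in practice.
It is written in Python and is not how Name Translator works
internally; the table is a partial Buckwalter mapping covering only
the letters in the example title, and the function names are
hypothetical.

```python
# Partial Buckwalter table: only the letters needed for the example title.
BUCKWALTER = {
    "\u0627": "A",  # alif
    "\u0628": "b",  # ba
    "\u0644": "l",  # lam
    "\u0642": "q",  # qaf
    "\u0635": "S",  # sad
    "\u0631": "r",  # ra
    "\u064A": "y",  # ya
    "\u0646": "n",  # nun
}
# Because each Arabic letter maps to exactly one Latin character,
# the table can simply be inverted.
REVERSE = {latin: arabic for arabic, latin in BUCKWALTER.items()}


def to_buckwalter(text):
    """Replace each known Arabic letter; pass anything else through."""
    return "".join(BUCKWALTER.get(ch, ch) for ch in text)


def from_buckwalter(text):
    """Invert to_buckwalter for the letters in the table."""
    return "".join(REVERSE.get(ch, ch) for ch in text)


# The unvocalized title: ba-ya-nun, space, alif-lam-qaf-sad-ra-ya-nun
title = "\u0628\u064A\u0646 \u0627\u0644\u0642\u0635\u0631\u064A\u0646"
romanized = to_buckwalter(title)
print(romanized)                            # byn AlqSryn
print(from_buckwalter(romanized) == title)  # True
```

The round trip succeeds because no information is added or dropped,
which is also why the output has no vowels: none were in the input.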

In order to render human-readable titles, you would need to insert
vowels by means of a vocalizer/diacritizer. Sakhr Software has
developed a 'Diacritizer engine' that is utilized in many of their
tools. I'm not sure how one would go about getting it in standalone
form, but assuming you can, that would be perhaps one of the better
choices for this task. Their diacritizer would likely convert
بين القصرين to بَين القَصرَين, which would then yield the slightly
more readable 'bayn AlqaSrayn'.
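
The effect of vocalization on the romanized output can be sketched as
follows. This does not call Sakhr's engine (no interface to it is
assumed here); the vocalized string is typed in by hand to stand in
for diacritizer output, and the helper names are hypothetical. The
short-vowel symbols follow Buckwalter's scheme.

```python
# Buckwalter symbols for the short-vowel marks and related diacritics.
HARAKAT = {
    "\u064E": "a",  # fatha
    "\u064F": "u",  # damma
    "\u0650": "i",  # kasra
    "\u0651": "~",  # shadda
    "\u0652": "o",  # sukun
}
# Only the consonants needed for the example title.
LETTERS = {
    "\u0627": "A", "\u0628": "b", "\u0644": "l", "\u0642": "q",
    "\u0635": "S", "\u0631": "r", "\u064A": "y", "\u0646": "n",
}
TABLE = {**LETTERS, **HARAKAT}


def is_vocalized(text):
    """True if the text carries at least one of the marks above."""
    return any(ch in HARAKAT for ch in text)


def to_buckwalter(text):
    """Map letters and diacritics one-to-one; pass the rest through."""
    return "".join(TABLE.get(ch, ch) for ch in text)


plain = "\u0628\u064A\u0646 \u0627\u0644\u0642\u0635\u0631\u064A\u0646"
# Hand-entered stand-in for diacritizer output (fatha after ba, qaf, ra).
vocalized = ("\u0628\u064E\u064A\u0646 "
             "\u0627\u0644\u0642\u064E\u0635\u0631\u064E\u064A\u0646")

print(is_vocalized(plain), to_buckwalter(plain))          # False byn AlqSryn
print(is_vocalized(vocalized), to_buckwalter(vocalized))  # True bayn AlqaSrayn
```

A check like is_vocalized could also be used to route only the
unvocalized titles through the diacritizer.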

I suspect that at this point you would want to normalize this output
into a more accepted (and readable!) transliteration standard (see
http://en.wikipedia.org/wiki/Wikipedia:Manual_of_Style_(Arabic)/).
For this part, I'm not aware of any tool that does the job really
well. If I had to do it, I would probably use some kind of
regular-expression-based Perl script to generate the kind of
transliteration scheme I needed.
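
A rough sketch of such a normalization pass, written in Python rather
than Perl. The rules are illustrative assumptions, not an established
standard: a leading "Al" is treated as the definite article, emphatic
consonants and alif are flattened to plain lowercase letters, and each
word is capitalized.

```python
import re

# Hypothetical simplification: drop the emphatic/plain distinction and
# write alif as a plain "a" so the result reads like ordinary Latin text.
SIMPLIFY = str.maketrans(
    {"S": "s", "D": "d", "T": "t", "Z": "z", "H": "h", "A": "a"}
)
# A word-initial "Al" followed by at least one more character.
ARTICLE = re.compile(r"Al(?=\w)")


def normalize_word(word):
    """Turn one Buckwalter-style word into a more readable form."""
    match = ARTICLE.match(word)
    # Naive: every word starting with alif-lam is assumed to carry the
    # article, which is wrong for words where those letters are radical.
    prefix = "al-" if match else ""
    stem = word[match.end():] if match else word
    stem = stem.translate(SIMPLIFY)
    return prefix + stem[:1].upper() + stem[1:]


def normalize(text):
    """Normalize a whole title, word by word."""
    return " ".join(normalize_word(w) for w in text.split())


print(normalize("bayn AlqaSrayn"))  # Bayn al-Qasrayn
```

Real titles would need many more rules (sun-letter assimilation, ta
marbuta, hamza, long vowels), which is part of why human review of
the output remains advisable.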

As you can see, most of these tasks are not trivial, and may require
some human supervision to verify the entries. I hope this information
helps.

Please let me know if I can offer any clarification.

Cheers,
Ben Huyck


------------------------------------------------------------------------
End of Arabic-L:  28 Aug 2006