Arabic-L:LING:Indo-European Language Transcription Database-Request for Collaboration

Dilworth Parkinson dil at BYU.EDU
Fri Oct 1 15:05:44 UTC 2010


------------------------------------------------------------------------
Arabic-L: Fri 01 Oct 2010
Moderator: Dilworth Parkinson <dil at byu.edu>
[To post messages to the list, send them to arabic-l at byu.edu]
[To unsubscribe, send message from same address you subscribed from to
listserv at byu.edu with first line reading:
            unsubscribe arabic-l                                      ]

-------------------------Directory------------------------------------

1) Subject: Indo-European Language Transcription Database-Request for Collaboration

-------------------------Messages-----------------------------------
1)
Date: 01 Oct 2010
From: Joel Shapiro <jrs_14618 at yahoo.com>
Subject: Indo-European Language Transcription Database-Request for Collaboration

Indo-European Language Transcription Database - Request For Collaboration/Help To Create It
 
Assalamu Alaikum, Salaam, Hello All,
 
My name is Joel Shapiro.  I am a long time subscriber to the Arabic-L forum and infrequent poster
to it.
 
In the past few years I have developed some unique language tools, utilities and applications:
transcription veracity verification, currently only "tooled" for Arabic, and, most recently and
as my current focus, a utility to search in all native language character sets or alphabets present
on the Internet to an unprecedented degree of precision, accuracy, flexibility and
"seamlessness".
 
My forte is Semitic languages and, to a lesser extent but in the same genre or "ballpark", Indo-European
languages (i.e., as you know, Pashto, Farsi/Dari, Urdu, Khowar etc.).
 
Below is a note I sent to one Urdu-speaking source, but I have not received any further
response.
 
To date I've found it very hard to find and then contact Indo-European speakers or experts, and the
English <---> Indo-European dictionaries or translation listings that I've found online are
primarily "regular" words, with proportionally very few phonetic transcriptions, which are my
objective.
 
Google indeed does have an impressive "transcription engine" for a host of what I personally term
"esoteric languages": those which are just beginning to have a substantial presence on the Internet:
 
http://www.google.com/transliterate
 
One of the language options is Persian (i.e. Farsi/Dari) which of course uses the extended Arabic
character set, but not to the extent of Pashto and Urdu where instances of "Arabic characters with
rings" become prevalent.  I think of it as an "extended extended" Arabic character set use.
 
Those of you reading this versed in Pashto and/or Urdu know or recognize exactly what I'm
referring to here.
 
My search utility is especially "tuned" for such fine, subtle distinctions; my programming
infrastructure for them is all in place.
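
A minimal Python sketch of the kind of distinction meant here, not the search utility itself.
It assumes only the standard unicodedata module and relies on the fact that Unicode names such
letters "... WITH RING". The function name ring_letters and the sample string are hypothetical,
chosen for illustration.

```python
import unicodedata


def ring_letters(text):
    """Return (character, Unicode name) pairs for Arabic-script letters drawn with a ring."""
    found = []
    for ch in text:
        # unicodedata.name() returns the default "" for unnamed code points.
        name = unicodedata.name(ch, "")
        if name.startswith("ARABIC LETTER") and "WITH RING" in name:
            found.append((ch, name))
    return found


# Hypothetical sample: four Pashto ring letters (as escapes) followed by plain Arabic beh.
sample = "\u067c\u0689\u0693\u06bc\u0628"

for ch, name in ring_letters(sample):
    print(f"U+{ord(ch):04X} {name}")
```

The plain Arabic letter at the end of the sample is skipped; only the four ring letters are
reported.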
 
You have my word I will act as a clearinghouse for your verified, attested transcriptions, and
I intend to post the (hopefully) ongoing, ever-growing database from all contributors, which I
strongly contend is useful in its own right for (your) manual native text searches ... sans my
utility.
 
I vigorously contend robust transcription databases can be tremendously helpful in many
circumstances and contexts although at first glance it may not appear so.

Sta na shukria, Tashakkur, Shukriya, Thank you all for your interest and consideration.
 
Joel S.
 
 
 
Assalamu Alaikum *******: Perhaps my work can help your fellow countrymen ...

I just left you a voicemail in which I said I would send you an e-mail describing my
specialized multilingual work, which may be useful in the dire situation in Pakistan
for gleaning information, finding where it is lacking and, perhaps most important of all, correcting
things that are incorrect (e.g. medical information, dosages etc.).

My forte is the subtleties and nuances of phonetics and transcriptions of Semitic and Indo-European
languages which directly applies to the very high precision searches and "Internet scraping" I can
perform with the Python programming application or utility I have developed ... completely on my own.

It functions better than I ever anticipated it would when I started working on it almost a year ago.
 
While my conversational expertise is in Hebrew, at what I'd call an intermediate level, I am very
familiar with hundreds of Arabic words and many basic sayings, greetings etc.  However, with respect
to spelling, I feel confident declaring my expertise is on par, in many contexts, with that of native
Arabic and Farsi speakers.

My search utility is a "local" application (i.e. it just resides on my computer).  The background
programming for it is far too complex to convert to a web page, simply because web tools don't have
the comprehensive esoteric "ingredients" I need, as Python does.

Without going into detail and specifics, I have thrown the "kitchen sink" into my application:
I understand that Farsi and Pashto use an extended Arabic character set, and that very subtle
differences in spelling and in the characters used in Arabic and Farsi, and subtle differences
between the languages, make all the difference in the world.
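
To make "subtle differences in characters" concrete, here is a hedged Python sketch, not the
application itself. It assumes just two well-known look-alike pairs: Arabic kaf (U+0643) versus
Persian keheh (U+06A9), and Arabic yeh (U+064A) versus Farsi yeh (U+06CC). The names LOOKALIKES
and spelling_variants are hypothetical, and a real search would need a much longer table.

```python
from itertools import product

# Arabic letter -> Persian counterpart that looks nearly identical in print.
# Illustrative subset only; this table is an assumption, not a complete list.
LOOKALIKES = {
    "\u0643": "\u06a9",  # ARABIC LETTER KAF -> ARABIC LETTER KEHEH
    "\u064a": "\u06cc",  # ARABIC LETTER YEH -> ARABIC LETTER FARSI YEH
}


def spelling_variants(word):
    """Return every spelling of word obtained by swapping look-alike letters."""
    reverse = {v: k for k, v in LOOKALIKES.items()}
    choices = []
    for ch in word:
        if ch in LOOKALIKES:
            choices.append((ch, LOOKALIKES[ch]))
        elif ch in reverse:
            choices.append((ch, reverse[ch]))
        else:
            choices.append((ch,))
    # Cartesian product over the per-letter choices gives all combinations.
    return {"".join(combo) for combo in product(*choices)}


# "kitab" (book) typed with the Arabic kaf; a search should also try the Persian keheh.
for variant in sorted(spelling_variants("\u0643\u062a\u0627\u0628")):
    print([f"U+{ord(c):04X}" for c in variant])
```

Searching for every variant, rather than only the spelling as typed, is one simple way to avoid
missing text that differs only in which code point was used for a visually identical letter.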

You can see a good example of my work on my Arabic transcription veracity verification web page:

http://enartrans.com
http://enartrans.com/transcription

and my datasets, the end result of my utility's processing:

http://infochimps.org/datasets/arabic-internet-footprint-of-the-brookings-institute-middle-east

Attached are a couple posts I made to a couple other sources in the same regard or genre.

My inclination or objective is perhaps to glean information, in a way no one else does, that could
be instrumental in saving some lives; here, with the flooding, the scope is orders of magnitude
greater.

All the specialized language programming and experience I have is directly applicable.  I could
start doing specialized searches with regard to this humanitarian crisis straight away.

My thinking is that I could devote some extensive searching for free (gratis), with the caveat that
it would reference my work, which would otherwise be on a for-pay or fee basis.  I think it would be
great free advertising, all the while doing something beneficial.

I was wondering if you have been approached in light of this crisis for your lingual and intellectual
abilities and if you have any ideas, references or suggestions for me.

From the work I have developed I really feel I have become apolitical.  I just want to make the world
a better place and provide for my family like anybody else.  If nothing else, I just want to extend
to you my sympathy and condolences for the people you knew who have died or are in
dire straits.

I welcome and look forward to hearing from you.

-Salam,

Joel S.

Joel Shapiro
Rochester, New York 14618
(585) 255-0997 (Cell - Call anytime - best to reach me)
(585) 473-7013 (Home - 9:30 to 22:00 EDT/EST)

jrs_14618 at yahoo.com
-or-
cshapiro at rochester.rr.com
 
http://enartrans.com/

--------------------------------------------------------------------------
End of Arabic-L: 01 Oct 2010