<HTML><BODY style="word-wrap: break-word; -khtml-nbsp-mode: space; -khtml-line-break: after-white-space; "><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; ">------------------------------------------------------------------------</DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; ">Arabic-L: Wed 15 Nov 2006</DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; ">Moderator: Dilworth Parkinson <<A href="mailto:dilworth_parkinson@byu.edu">dilworth_parkinson@byu.edu</A>></DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; ">[To post messages to the list, send them to <A href="mailto:arabic-l@byu.edu">arabic-l@byu.edu</A>]</DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; ">[To unsubscribe, send message from same address you subscribed from to</DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; "><A href="mailto:listserv@byu.edu">listserv@byu.edu</A> with first line reading:</DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; "> unsubscribe arabic-l ]</DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; font: normal normal normal 12px/normal Helvetica; min-height: 14px; "><BR></DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; ">-------------------------Directory------------------------------------</DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; font: normal normal normal 12px/normal Helvetica; min-height: 14px; "><BR></DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; ">1) Subject:Arabic OCR response</DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; ">2) Subject:Arabic OCR response</DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; ">3) Subject:Arabic OCR response</DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; font: normal normal normal 12px/normal Helvetica; min-height: 14px; "><BR></DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; ">-------------------------Messages-----------------------------------</DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; ">1)</DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; ">Date: 15 Nov 2006</DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; ">From:<B> </B>"Ben Huyck" <<A href="mailto:regexer@gmail.com">regexer@gmail.com</A>></DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; ">Subject:rabic OCR response</DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; font: normal normal normal 12px/normal Helvetica; min-height: 14px; "><BR></DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; ">Dear Waheed,</DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; font: normal normal normal 12px/normal Helvetica; min-height: 14px; "><BR></DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; ">As far as I am aware, there are really only two viable resources for Arabic OCR. They are Sakhr's Automatic Reader, and NovoDynamics' VERUS. They both perform well, although I personally have more experience with Sahkr. </DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; font: normal normal normal 12px/normal Helvetica; min-height: 14px; "><BR></DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; ">[urls]</DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; "><FONT class="Apple-style-span" size="2"><SPAN class="Apple-style-span" style="font-size: 10px;">www.</SPAN></FONT><FONT class="Apple-style-span" size="2"><SPAN class="Apple-style-span" style="font-size: 10px;"><B>sakhr</B></SPAN></FONT><FONT class="Apple-style-span" size="2"><SPAN class="Apple-style-span" style="font-size: 10px;">.com</SPAN></FONT></DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; "><FONT class="Apple-style-span" size="2"><SPAN class="Apple-style-span" style="font-size: 10px;">www.</SPAN></FONT><FONT class="Apple-style-span" size="2"><SPAN class="Apple-style-span" style="font-size: 10px;"><B>novodynamics</B></SPAN></FONT><FONT class="Apple-style-span" size="2"><SPAN class="Apple-style-span" style="font-size: 10px;">.com</SPAN></FONT></DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; font: normal normal normal 12px/normal Helvetica; min-height: 14px; "><BR></DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; ">I know Sakhr works with MS Word, and I am fairly certain that VERUS does as well. Both are reasonably fast (OCR tools usually are), and both are capable of batch processing. </DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; font: normal normal normal 12px/normal Helvetica; min-height: 14px; "><BR></DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; ">As far as accuracy goes, you'll have to be the judge of how "clean" your documents are. Both tools will degrade as the quality of input images goes down, but one of NovoDynamics' main selling points is that it performs well on degraded images. I'm not aware of any independent evaluation of this claim. </DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; font: normal normal normal 12px/normal Helvetica; min-height: 14px; "><BR></DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; ">Unfortunately, quirk-free software is not generally available in a field as new as Arabic OCR, as the tools have not developed a large enough user base to fully mature. You'll find that given the type of data processed you will get varying results. </DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; font: normal normal normal 12px/normal Helvetica; min-height: 14px; "><BR></DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; ">Of course, all of this information is nearly useless if you don't take steps during the scanning process to ensure that you are preparing the electronic images correctly for OCR. This ranges from the most obvious characteristics such as resolution (anything under 200dpi is generally useless, 300 is optimal for most OCR tools), to seemingly insignificant settings such as the default contrast. </DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; font: normal normal normal 12px/normal Helvetica; min-height: 14px; "><BR></DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; ">I co-authored a paper on this very topic. If you're interested, you can get it at the following url.</DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; font: normal normal normal 12px/normal Helvetica; min-height: 14px; "><BR></DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; "><A href="http://www.mitre.org/work/tech_papers/tech_papers_05/05_0150/index.html">http://www.mitre.org/work/tech_papers/tech_papers_05/05_0150/index.html </A></DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; font: normal normal normal 12px/normal Helvetica; min-height: 14px; "><BR></DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; ">All the best,</DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; ">Ben Huyck</DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; "><BR class="khtml-block-placeholder"></DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; ">--------------------------------------------------------------------------</DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; ">2)</DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; ">Date: 15 Nov 2006</DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; ">From:<B> <SPAN class="Apple-style-span" style="font-weight: normal;">Jan Hoogland <<A href="mailto:j.hoogland@let.ru.nl">j.hoogland@let.ru.nl</A>></SPAN></B></DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; ">Subject:rabic OCR response</DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; font: normal normal normal 12px/normal Helvetica; min-height: 14px; "><BR></DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; ">Hi Waheed,</DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; ">I heared ReadIris, that comes with Agfa scanners and maybe with other brands</DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; ">as well, performed well on Arabic. So no expensive extra software, but</DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; ">simply the OCR programme that comes with the scanner.</DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; ">I myself haven't been using OCR for ages. In the nineties there was Al Qari'</DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; ">Al Ali, but you're closer to SAkhr than I am to ask them about its</DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; ">developments since then.</DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; ">Regards,</DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; ">Jan</DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; font: normal normal normal 12px/normal Helvetica; min-height: 14px; "><BR class="khtml-block-placeholder"></DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; ">--------------------------------------------------------------------------</DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; ">3)</DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; ">Date: 15 Nov 2006</DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; ">From:<B> <SPAN class="Apple-style-span" style="font-weight: normal;">"al-Husein N. Madhany" <<A href="mailto:anm@post.harvard.edu">anm@post.harvard.edu</A>></SPAN></B></DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; ">Subject:rabic OCR response</DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; font: normal normal normal 12px/normal Helvetica; min-height: 14px; "><BR></DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; ">May I recommend IRIS's latest Arabic OCR product? It costs about</DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; ">$500. I just tested it a little while ago, and it works reasonably</DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; ">well. But bear in mind you still have to read through everything,</DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; ">don't trust the program to transcribe a full text error-free for one</DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; ">second!</DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; min-height: 14px; "><BR></DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; ">A more costly alternative is Sakhr OCR, but it is harder to obtain.</DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; min-height: 14px; "><BR></DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; ">al-Husein Madhany</DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; "><A href="mailto:anm@uchicago.edu">anm@uchicago.edu</A></DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; font: normal normal normal 12px/normal Helvetica; min-height: 14px; "><BR class="khtml-block-placeholder"></DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; ">--------------------------------------------------------------------------</DIV><DIV style="margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; ">End of Arabic-L: 15 Nov 2006</DIV></BODY></HTML>