<div dir="ltr">------------------------------------------------------------------------<br>Arabic-L: Thu 01 Nov 2012<br>Moderator: Dilworth Parkinson &lt;<a href="mailto:dilworth_parkinson@byu.edu" target="_blank">dilworth_parkinson@byu.edu</a>&gt;<br>




[To post messages to the list, send them to <a href="mailto:arabic-l@byu.edu" target="_blank">arabic-l@byu.edu</a>]<br>[To unsubscribe, send message from same address you subscribed from to<br><a href="mailto:listserv@byu.edu" target="_blank">listserv@byu.edu</a> with first line reading:<br>




           unsubscribe arabic-l                                      ]<br><br>-------------------------Directory------------------------------------<br><br>1) Subject:GEN:Arabic OCR<br><br>-------------------------Messages-----------------------------------<br>




1)<br>Date: 01 Nov 2012<br>From: <span style="font-family:arial,sans-serif;font-size:13px">Saqer Almarri &lt;<a href="mailto:saqer.almarri@gmail.com" target="_blank">saqer.almarri@gmail.com</a>&gt;</span><br>Subject: Arabic OCR<br>


<br><span style="font-family:arial,sans-serif;font-size:13px">I recently found out that Tesseract-OCR (which Google uses) supports</span><br style="font-family:arial,sans-serif;font-size:13px">
<span style="font-family:arial,sans-serif;font-size:13px">Arabic. See here: </span><a href="http://code.google.com/p/tesseract-ocr/" style="font-family:arial,sans-serif;font-size:13px" target="_blank">http://code.google.com/p/tesseract-ocr/</a><span style="font-family:arial,sans-serif;font-size:13px">. However,</span><br style="font-family:arial,sans-serif;font-size:13px">



<span style="font-family:arial,sans-serif;font-size:13px">this is just the engine; you can use it with OCRFeeder as a frontend</span><br style="font-family:arial,sans-serif;font-size:13px"><span style="font-family:arial,sans-serif;font-size:13px">(available on Linux only, not on Windows or Mac):</span><br style="font-family:arial,sans-serif;font-size:13px">



<a href="https://live.gnome.org/OCRFeeder" style="font-family:arial,sans-serif;font-size:13px" target="_blank">https://live.gnome.org/OCRFeeder</a><br style="font-family:arial,sans-serif;font-size:13px"><br style="font-family:arial,sans-serif;font-size:13px">



<span style="font-family:arial,sans-serif;font-size:13px">Tesseract works really well with English, but is still buggy with</span><br style="font-family:arial,sans-serif;font-size:13px"><span style="font-family:arial,sans-serif;font-size:13px">Arabic. It's a step in the right direction, and considering both</span><br style="font-family:arial,sans-serif;font-size:13px">



<span style="font-family:arial,sans-serif;font-size:13px">Tesseract-OCR and OCRFeeder are open source, those of you who work in</span><br style="font-family:arial,sans-serif;font-size:13px"><span style="font-family:arial,sans-serif;font-size:13px">computational linguistics can contribute.</span><br style="font-family:arial,sans-serif;font-size:13px">



<br style="font-family:arial,sans-serif;font-size:13px"><span style="font-family:arial,sans-serif;font-size:13px">Regards,</span><br style="font-family:arial,sans-serif;font-size:13px"><span style="font-family:arial,sans-serif;font-size:13px">Saqer</span><br style="font-family:arial,sans-serif;font-size:13px">



<br>--------------------------------------------------------------------------<br>End of Arabic-L: 01 Nov 2012<br></div>