Obras Clasicas sobre la Lengua Nahuatl

Henry Kammler henry.kammler at stadt-frankfurt.de
Tue Jul 6 08:21:57 UTC 1999


> The way to translate scanned images into text is to use OCR (Optical
> Character Recognition) software. Unfortunately, while OCR software has come
> a long way in functionality in the last five or so years, it is still
> relatively limited in scope. It reads only certain printed fonts and only
> if they are very clearly arranged, easy to read, and conform to industry
> standards.
OCR software has come a long way already. A quite powerful program
that I use is FineReader Pro (a program from Russia). It can be
trained to read a wide range of glyphs. After an initial "training"
phase (this can be as simple as telling it to read every X as U --
after three or so Xs it will read all Xs as Us) the recognition rate
is very high. I haven't used it with handwriting, but I have used it
with very blurry typescript containing non-standard phonetic symbols,
and I was surprised how well it worked after a few training runs.
Since you can freely define the "borders" of each glyph while training
it (it keeps the main variants of each letter's shape in a database),
it should work with handwriting too. It handles ligatures without any
problems. Consider how much text you have from each scribe, though,
because every individual hand would require its own database to be
set up first.
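
To make the "read every X as U" idea concrete, here is a minimal
Python sketch of training by correction. It is only a toy model and
not how FineReader works internally: the class name CorrectionTrainer,
its methods, and the threshold of three corrections are all
assumptions made up for illustration. It also works on already
recognized characters, whereas a real OCR program trains on glyph
shapes.

```python
from collections import Counter, defaultdict


class CorrectionTrainer:
    """Toy model of 'train by correcting' (hypothetical, for illustration)."""

    def __init__(self, threshold=3):
        # Number of identical corrections needed before a rule is applied.
        self.threshold = threshold
        # Maps a recognized character to a tally of the corrections given.
        self.counts = defaultdict(Counter)

    def correct(self, recognized, intended):
        """Record that the user changed `recognized` into `intended`."""
        self.counts[recognized][intended] += 1

    def learned_rules(self):
        """Return the substitutions that have been confirmed often enough."""
        rules = {}
        for glyph, corrections in self.counts.items():
            target, seen = corrections.most_common(1)[0]
            if seen >= self.threshold:
                rules[glyph] = target
        return rules

    def apply(self, text):
        """Apply every learned substitution to a recognized text."""
        rules = self.learned_rules()
        return "".join(rules.get(ch, ch) for ch in text)


trainer = CorrectionTrainer(threshold=3)
for _ in range(3):
    trainer.correct("X", "U")  # the user fixes X to U three times

print(trainer.apply("XNIVERSITY OF XTAH"))
```

Each scribe's hand would get its own trainer object in this picture,
which is why the amount of text per hand matters.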

I probably missed this, but where is that CD available, and how much
does it cost?

Cheers,
Henry


