OCR advice?

Michael A. Denner mad197 at LULU.ACNS.NWU.EDU
Tue Feb 1 23:22:34 UTC 2000


Mark: read this for me & offer any advice. Questions I should ask? Gaffes
I've made?

Dear SEELANGers,

I'm about to begin a project that involves converting 200+ pages of
Cyrillic text into HTML. Since many tech-savvy people and companies read
this list, I thought I'd start here.

I'm looking for recommendations for OCR software. The text that needs to
be scanned is clear & fairly homogeneous, but it's poetry, so formatting is
a complicated affair. Since this will eventually be used in HTML documents,
the software should save the text in (I think) KOI-8, and preferably in
other encodings as well (like the Mac or PC Cyrillic encodings used for
HTML editing). Ideally, it should scan directly into Microsoft Word, since
I've had good luck converting Cyrillic documents from Word to DreamWeaver
(the HTML editor I use).
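
For anyone weighing the encoding question, here is a minimal sketch of
re-encoding OCR output between KOI8-R, the Windows ("PC") code page 1251,
and Mac Cyrillic, and of wrapping a poem in HTML so that line breaks and
indentation survive. It assumes Python and its standard codecs; the function
names (transcode, poem_to_html) and the sample text are hypothetical, and
nothing here reflects what any particular OCR package, Word, or DreamWeaver
actually does.

```python
import html

# Charset labels (as they would appear in HTML) mapped to Python codec names.
# Assumption: "KOI-8" in the message means KOI8-R.
CODECS = {
    "koi8-r": "koi8_r",
    "windows-1251": "cp1251",
    "x-mac-cyrillic": "mac_cyrillic",
}


def transcode(data: bytes, src: str, dst: str) -> bytes:
    """Re-encode Cyrillic bytes from one charset label to another."""
    return data.decode(CODECS[src]).encode(CODECS[dst])


def poem_to_html(text: str, charset: str = "koi8-r") -> bytes:
    """Wrap a poem in a minimal HTML page encoded in the given charset.

    <pre> keeps the poem's line breaks and leading spaces as scanned.
    """
    body = html.escape(text)  # escape &, <, > so the text cannot break the markup
    page = (
        "<html><head>"
        f'<meta http-equiv="Content-Type" content="text/html; charset={charset}">'
        "</head><body>\n"
        f"<pre>{body}</pre>\n"
        "</body></html>\n"
    )
    return page.encode(CODECS[charset])


if __name__ == "__main__":
    sample = "Мороз и солнце;\n    день чудесный!"
    koi_bytes = sample.encode("koi8_r")  # stand-in for OCR output saved as KOI8-R
    win_bytes = transcode(koi_bytes, "koi8-r", "windows-1251")
    with open("poem.html", "wb") as fh:
        fh.write(poem_to_html(win_bytes.decode("cp1251")))
```

The point of the sketch is that the choice of output encoding need not be
final: as long as the OCR output's encoding is known, the text can be
re-encoded later, and the charset declared in the page just has to match the
bytes actually written.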

Has anyone had any experience with OCR technology? Any problems using the
resulting text in HTML? Does Microsoft offer integrated software for working
with Cyrillic? Any and all advice is appreciated. Please respond off list,
unless you believe that your response will be of general interest.

Michael A. Denner
Northwestern University


+++***+++
the preacher should shout... with thundering voice: "'pause, avast, why so
seeming fast, but deadly slow?'"
thoreau. walden. 1854.



