11.2637, Qs: Korean Corpus, Cyrillic OCR

The LINGUIST Network linguist at linguistlist.org
Wed Dec 6 04:09:19 UTC 2000


LINGUIST List:  Vol-11-2637. Tue Dec 5 2000. ISSN: 1068-4875.

Subject: 11.2637, Qs: Korean Corpus, Cyrillic OCR

Moderators: Anthony Aristar, Wayne State U. <aristar at linguistlist.org>
            Helen Dry, Eastern Michigan U. <hdry at linguistlist.org>
            Andrew Carnie, U. of Arizona <carnie at linguistlist.org>

Reviews: Andrew Carnie: U. of Arizona <carnie at linguistlist.org>

Editors: Karen Milligan, Wayne State U. <karen at linguistlist.org>
         Michael Appleby, E. Michigan U. <michael at linguistlist.org>
         Rob Beltz, E. Michigan U. <rob at linguistlist.org>
         Lydia Grebenyova, E. Michigan U. <lydia at linguistlist.org>
         Jody Huellmantel, Wayne State U. <jody at linguistlist.org>
         Marie Klopfenstein, Wayne State U. <marie at linguistlist.org>
         Naomi Ogasawara, E. Michigan U. <naomi at linguistlist.org>
         James Yuells, Wayne State U. <james at linguistlist.org>
         Ljuba Veselinova, Stockholm U. <ljuba at linguistlist.org>

Software: John Remmers, E. Michigan U. <remmers at emunix.emich.edu>
          Gayathri Sriram, E. Michigan U. <gayatri at linguistlist.org>

Home Page:  http://linguistlist.org/

The LINGUIST List is funded by Eastern Michigan University, Wayne
State University, and donations from subscribers and publishers.


Editor for this issue: Karen Milligan <karen at linguistlist.org>
 ==========================================================================

We'd like to remind readers that the responses to queries are usually
best posted to the individual asking the question. That individual is
then strongly encouraged to post a summary to the list. This policy was
instituted to help control the huge volume of mail on LINGUIST; so we
would appreciate your cooperating with it whenever it seems appropriate.

=================================Directory=================================

1)
Date:  Mon, 4 Dec 2000 19:25:46 +0300
From:  "Elena Rudnitskaya" <erudnits at mtu-net.ru>
Subject:  Korean Corpus

2)
Date:  Mon, 4 Dec 2000 15:19:03 EST
From:  JeffPower at aol.com
Subject:  Cyrillic OCR

-------------------------------- Message 1 -------------------------------

Date:  Mon, 4 Dec 2000 19:25:46 +0300
From:  "Elena Rudnitskaya" <erudnits at mtu-net.ru>
Subject:  Korean Corpus

Dear Linguists,

I am looking for a Korean corpus on the Internet. If you happen to know of
one, please let me know its web address.

Elena Rudnitskaya
erudnits at mtu-net.ru


-------------------------------- Message 2 -------------------------------

Date:  Mon, 4 Dec 2000 15:19:03 EST
From:  JeffPower at aol.com
Subject:  Cyrillic OCR

Ladies and gentlemen:

I am involved in a large project to develop, modify, or obtain an OCR
tool that can recognize strings of handwritten Cyrillic characters that were
recorded in columns on ledgers or logs. Each CD would hold the data for a
given year, in chronological sequence by type of event (i.e., birth, death,
divorce, marriage).  Data would include given names, surnames, family
information, dates of birth, death, towns, etc.  Such a tool would enable
users to search for their family surnames and towns for family research
purposes. The software would have to be capable of recognizing most
reasonably well-formed handwritten Cyrillic characters.

The data will be formatted for CD-ROM after being transferred from microfilms
of paper records.  I may have some influence over the preparation of the CDs,
so any suggestions for formatting requirements would be most helpful.

I would be interested in hearing from anyone knowing about these issues. What
can you tell me about the software used for these purposes, any of your own
experiences with or knowledge of such projects, and any advice that you can
give on how I should proceed?

Thanks!!
Jeff Miller
E-mail: singingtm at aol.com, or, jeff at leaderfun.com
Maryland, US

---------------------------------------------------------------------------
LINGUIST List: Vol-11-2637


