other languages? cost?

Pat Warren warr0120 at umn.edu
Thu Jan 8 04:48:07 UTC 2004


On 7 Jan 2004, Jimm GoodTracks wrote:
> Pat:
> Someone, I believe, Carolyn, was asking about doing other languages, like
> Osage.  Does your time permit you to consider languages other than
> Dakota/Omaha-Ponca?  If so, what would be the cost per microfilm frame?
> Jimm

Hey Jimm,

Short answer:

Yes, I'm open to working on anything. Even more so if it's related to
something I'm already working on. Cost, zero. This work is too important to
involve money. The commitment has to come from personal motivation or it'll
be like other great ideas: when the funding runs out so does the interest
in doing the work.

Long (winded) answer:

Yes, my time permits going in any direction. I'm focusing on Dakota and
Ojibwe materials because I was learning those two languages at the U of MN
and realized that the situation with written materials to help learners
was sad. So my original goal two years ago was to get everything
on or in the language and culture gathered together so anyone could have
access to it. But this situation is true for most languages.

My interests are now more in developing the overall program as a
replicable, cooperative venture, and in broadening the application of the
kind of digitization and linking together of texts I'm doing. For example,
for the last few months I focused on scanning materials in Italian, German,
and French on Somali and Tigrinya, two East African languages that are spoken a
lot around the Twin Cities, since I also want to spend a good amount of my
time learning the languages that are used where I live, which was a major
reason for wanting to learn Dakota and Ojibwe.

I'm totally open to working on just about anything. Anything in Siouan or
Algonquian languages is an easy sell to me. I would love to work on Osage
materials. I really enjoy what I'm doing, except for some of the
programming, but that'll be less time-consuming soon. So if people have
materials they want digitized, let's get on it! After I finish the bulk of
the initial programming over the next couple of months and can focus on the
digitization itself, I see the following process emerging: I can focus on
the scanning of materials and the training of ocr software to work with the
different typefaces and quality levels, then I can post ocr results for
anyone to proof who's interested, then I can do the more detailed data
coding so texts can be combined and farmed. And there's the potential for
getting others set up to do any of this independently. It depends on how
much you want to and are able to commit.

At the Minnesota Historical Society today we talked about how it's great
that I've got these CDs of images of Iapi Oaye, and that it's possibly the
most complete collection around, but that it'll be much more important as a
text document (many sources, like the innumerable slightly
differing versions of bibles and prayer books in native languages, may only
have value as full text data), and eventually as a highly detailed coded
database file that can be integrated with dictionaries, grammars,
ethnographies, histories, etc. Now, it's a big task to proof 70 years' worth
of a monthly newspaper, but once the scanning is done and the ocr is run,
if there are several people working on different parts of it, it can really
move along.

The project I've got going is at its core an open source, volunteer
project. I don't want any money involved with it. Much of the equipment I
use is public, and the cost is as close to zero as it can get. Just time.
But it's good work, and it's hard to avoid learning some Dakota when you
proof a whole dictionary. You get really familiar with the materials and
their content in doing this work, and that's a huge reward for me, plus
then I can make all this work available to others, and create a way for
lots of people to mesh their efforts together and make everyone's research
time more productive and increase the quality of the work. Just imagine
having the Siouan Languages List linked to all this, so every time you make
a citation from some source your email would link right to it. Or at least
getting everyone access to the same materials would make it that much more
useful when people share questions and insights.

So like I said about the Dorsey film, if I have it in my hands I can scan it.
Not in a day or two, probably four or five months just for scanning and a
few more to process the images. If you have any other sources that are
really important to get scanned, let me know. The interlibrary loan
services here are great, and if materials are rare, just get em to me and I
can do the work. Eventually I'll have a user's guide for the whole project
so anyone can see all the equipment I use, my standards, my processing
techniques, how I code data and program the web interface, etc., so anyone
can do the same kind of thing without the two year development investment.

Pat


