example uses of digitized material - brave against the enemy

Pat Warren warr0120 at umn.edu
Mon Jan 26 07:59:18 UTC 2004


Hi John,

Thanks for giving me some feedback!

> Very nice!  And nice looking, too.  I take it that the text format
> material is the OCR version?

Yes, it's a slightly proofed version of the OCR results.

> This makes me realize that Dorsey's two published text collections (and
> the LaFlesche ms texts in the APS) are probably as important to scan as
> the microfilm of the Dorsey ms material.

As I've been moving along in this project the scope of what is important to
scan has been increasing exponentially. I want to set in motion the
digitization of the most important sources for Siouan languages for now.
But I think as people start to use this kind of thing later this year their
sights will keep going higher like mine do. This is really a logical way to
go with digitization.

> I have tended to think of this
> project in terms of how to get access to something I have difficulty
> accessing, not in the larger terms of how to make it universally and
> conveniently accessible.  This approach not only makes the material
> accessible to specialists who make a certain level of effort, but,
> really, to everyone.  It solves the publication problem as well as the
> manuscript access problem.

Oh goodie, now someone else gets it! This has been a goal all along: to
create a digital environment for this data that can be customized for
different people's needs (students, scholars, speakers...). Universal
access is the way to go!  It's all set up to work as a general
digitization project for any medium or subject. It's also intended to make
what would seem like "different" projects (like Algonquian and Siouan
language resources) integrate automatically to create larger databases and
networks. For example, a preliminary comparative Algonquian-Siouan
dictionary (either constantly updated or static) could be automatically
generated from coded XML text versions of the sources. An XSL stylesheet
could then display it as an edited/annotated version of the automatically
generated data, with extra data added where necessary, and that version
would in turn become input for other derivative works. I'm sure most people
aren't going to understand what I mean until they see it in action though.
Soon.
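
To make the dictionary idea a bit more concrete, here is a minimal sketch in
Python rather than XSL. Everything in it is an assumption for illustration:
the element names (source, entry, form, gloss), the "family" attribute, and
the placeholder forms, which are not real Siouan or Algonquian data. It only
shows the general shape of merging coded XML sources by gloss and writing the
result back out as XML so it can feed other derivative works.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical coded sources. Element and attribute names are assumptions,
# and the forms are placeholders, not real language data.
SIOUAN_XML = """
<source family="Siouan">
  <entry><form>siouan-form-1</form><gloss>water</gloss></entry>
  <entry><form>siouan-form-2</form><gloss>house</gloss></entry>
</source>
"""

ALGONQUIAN_XML = """
<source family="Algonquian">
  <entry><form>algonquian-form-1</form><gloss>Water</gloss></entry>
</source>
"""


def build_comparative(*xml_texts):
    """Group the forms from every source under their shared gloss."""
    table = defaultdict(dict)
    for text in xml_texts:
        root = ET.fromstring(text.strip())
        family = root.get("family")
        for entry in root.iter("entry"):
            # Normalize the gloss so "Water" and "water" land in one row.
            gloss = entry.findtext("gloss").strip().lower()
            form = entry.findtext("form").strip()
            table[gloss].setdefault(family, []).append(form)
    return dict(table)


def to_xml(table):
    """Serialize the merged table as XML, ready to be input elsewhere."""
    root = ET.Element("comparative")
    for gloss in sorted(table):
        row = ET.SubElement(root, "row", gloss=gloss)
        for family in sorted(table[gloss]):
            for form in table[gloss][family]:
                ET.SubElement(row, "form", family=family).text = form
    return ET.tostring(root, encoding="unicode")


comparative = build_comparative(SIOUAN_XML, ALGONQUIAN_XML)
print(to_xml(comparative))
```

A stylesheet, or a hand-edited annotation layer, would then sit on top of the
generated XML instead of replacing it, so the automatic part can be rerun
whenever a source is updated.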

> User Notes
>
> I had a little trouble at first grasping the navigational system - I've
> always been a bit dullwitted about icons - but once I had the suggested
> font installed and understood the icons (two arrows means further in a
> relevant direction than one arrow, like on music players all over the
> world) and saw the structure of the site:  home page > index page >
> material pages, with material in one of the six presentation formats
> selected in the home page, and the index organized accordingly, I was OK.
> You can stay in one format, or switch back and forth as desired.  It
> might help if there were some of those help boxes that you get by
> hovering, or if the home page said explicitly "Select a format." Maybe
> the index pages could say index page for format x.

Yes, having help available for every inch of the website will be very
important. I plan to have a very detailed plain language help link on each
page that will conditionally explain how to use the page based on what the
page is and contains. Because of the way I'm designing absolutely
everything in XML this will be really easy to do. I'll also have a
preferences option so you can set the mode you want to work in, like
"beginner's mode" which will have labels on everything so you can figure
out the navigation easily and then get rid of the labels when you catch on.
And tooltips will be there too (the text that shows on hover).
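
As a rough sketch of how conditional help and a beginner's mode could fall out
of an XML page description, consider the following Python. The page markup,
the help texts, and the function names are all hypothetical, made up for this
illustration and not taken from the actual site.

```python
import xml.etree.ElementTree as ET

# Hypothetical description of one page; element names are assumptions.
PAGE_XML = """
<page type="index" format="text">
  <nav kind="next"/>
  <nav kind="last"/>
</page>
"""

# Hypothetical plain-language help, keyed by page type and by nav control.
PAGE_HELP = {
    "home": "Select a format to begin.",
    "index": "This is the index page for the {format} format.",
}
NAV_HELP = {
    "next": "One arrow moves one page forward.",
    "last": "Two arrows jump to the last page.",
}


def help_for(page_xml):
    """Build help text from what the page is and what it contains."""
    page = ET.fromstring(page_xml.strip())
    lines = [PAGE_HELP[page.get("type")].format(format=page.get("format"))]
    for nav in page.iter("nav"):
        lines.append(NAV_HELP[nav.get("kind")])
    return lines


def labels_for(page_xml, mode="beginner"):
    """Return visible labels for nav icons; none once beginner mode is off."""
    if mode != "beginner":
        return []
    page = ET.fromstring(page_xml.strip())
    return [nav.get("kind") for nav in page.iter("nav")]


print(help_for(PAGE_XML))
print(labels_for(PAGE_XML, mode="beginner"))
```

The point of the design is that the help is derived from the same description
that builds the page, so a new kind of page or control only needs one new
help entry.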

In fact, the six formats are only six options out of endless possibilities. I've
spent a lot of time working on theoretical design issues and data modeling
because I have plans for the various levels of digitization that are
extremely ambitious. The whole project is designed to be segmentable, so
you can create versions with only a few features (e.g. only text or only
images) or many features depending on your interests and what sources you
want to include. It's also all designed to be progressive yet fully
functional at all times, so new things can always be added and even works
in progress can be accessible with a description of what is and isn't done.
Once it's up and running there'll never be any "under construction"
annoyances. Take the split screen idea. Eventually I want that to be a
proofing screen, so anyone who wants to could work on proofing the OCR
results of any page of any scanned source right online and submit their
updates right from that page. The whole thing will be interactive as a way
to both view and input data, kind of a really souped up archival version of
Shoebox. There's a ton of other aspects to this, but I wanted to at least
get some of the simpler uses of what I'm doing out there for some people to
see. Sometime later this year I think I'll have the core programming done
and have a fully functional version available for a few sources, with a
full manual / intro / guide describing the whole project, the technical
details, etc.

Take care,
Patrick


