Archiving course at the LSA institute?

Heidi Johnson hjohnson at MAIL.UTEXAS.EDU
Tue Apr 13 20:04:28 UTC 2004


Steven wrote:
> I think an LSA course on language archiving would be a great idea.
Me too. We need a forum that will allow time for hands-on work
with the recommended practices that we've been developing. Also, we
should try to have some kind of best practice/archiving presence
at every large gathering of linguists, to the extent possible.

> We could flesh out the title borrowing from OLAC's vision statement:
> "Best current practice for the digital archiving of language resources".
Or how about
"Best practice for the creation and archiving of digital language resources"

in case there are people who would feel excluded by not having an
archive. Also, we want people to produce archivable stuff from the start.

>
> My main concern would be creating the required content.  The required
> knowledge isn't concentrated in any one individual.
Couldn't we employ a tag-team approach to instructors? If we plan
the schedule wisely, we could have different sets of people teaching
each week, so no one would have to stay the whole time. Looking at
last year's schedule, I see that it's possible to offer a 3-week
course that meets 4 days a week. That would probably be good for
our purposes in terms of number of class hours.

> There's a diverse range of issues to cover and most of them lack best
> practice recommendations, e.g.:
>
> o  collection and curation of audio and video
> o  integration with existing field methods
> o  rights management, publication citation
> o  hardware and software
> o  preservation and access
> o  legacy data
>
> Where will the content come from?  Someone might object that most of the
> above isn't specifically to do with archiving, but I believe our focus
> has to be on creating archive-quality documentation in the first place.
Absolutely! And organizing those field corpora so that the archivists
can figure out what everything is when they finally get the stuff.

> When Heidi was in Melbourne recently we discussed the components of such
> a course, and came up with something like the following:
>
> o  principles - what people should know in order to make informed
> choices concerning available technologies and practices
> (cf Chilin Shih's lecture on digital audio at the
> 2003 EMELD workshop in Michigan)
Also tech & practice concerning text (XML, Unicode) and even rights.
Shibboleth is coming soon to all of our universities, and it looks very
promising in terms of facilitating the kind of fine-grained rights
management that we would like to have. For this section, "principles"
would include basic facts about copyright around the world, and what
kinds of restrictions/protocols are technologically feasible.
>
> o  recommendations - detailed suggestions concerning the above issues
> (e.g. specific microphones, metadata schemas, corpus management
> strategies, text formats, etc.)
>
> o  tools - tutorials on a selection of useful fieldwork software and
> discussion of (in)appropriate use
>
> o  workshop - reports from participants on their own experiences with
> creating and preserving digital language documentation
>
>
Steven, as usual, has organized a big mess of issues into a few neat,
coherent, manageable categories! I was also thinking that we could use
the Portability paper (Bird & Simon, 2003) as a guide for organizing
the workshop and as a major reading, of course. Somehow we would want
to hit all of those points.

This is already enough for a one-page plan. I doubt that we have to
produce a syllabus this far ahead of time (do we?). If we
could get reasonably firm-ish commitments from a set of instructors
who could cover the subject areas, that would probably be enough for
a proposal, wouldn't it?

In thinking of potential instructors, we might want to look at the
set of topics in terms of areas of expertise. For example, a phonetician
like Chilin Shih or Bartek Plischka could talk about audio recording
and analysis and conversion, from the principles level to the tools level.
There must be similar people for rights, texts, databases, etc.
(Do we know anyone who knows anything about video?)

So we need people who can teach best practice, from principles to tools,
in at least these areas (adapting Steven's first list):
* resource creation: audio, text, database, video (?), others?
* rights management, citation guidelines
* corpus management (sets, bundles, relations, and labelling schemes)
* integration with existing field methods (how to add all this stuff
   to your project)
* preservation & access: how to build an archive
* legacy data

The last couple of class days could be the workshop component (or the
first, maybe, as a kind of introductory activity). We might also
consider a day or two specifically about establishing an archive, for
participants who are planning or dreaming about doing that.

It's a lot, but doable, I think. Is it possible to get people to sign
on for teaching duty this far in advance? While writing I've just decided
that I could spend 3 weeks in Boston if necessary... the dog will just
have to go be spoiled rotten at my parents' house for the duration. So
I can volunteer for corpus management and maybe rights management, if
we can't find anyone better.

Heidi


