OLAC Protocol for Metadata Harvesting

Steven Bird sb at UNAGI.CIS.UPENN.EDU
Thu Dec 27 15:25:38 UTC 2001


Folks,

Gary and I recently discussed how to support multiple languages in the
archive description (i.e. the collection-level metadata) defined in the
OLAC-PMH:

I wrote:
> I think we need to permit a lang attribute for the text-valued elements.
> The other option - specifying that English will be used - is likely to
> be unacceptable.  This then raises the possibility that archives might
> want to provide these text-valued elements in multiple languages, which
> starts to sound painful.  Do you have any thoughts on this?

Gary wrote:
> This is a good point.  After pondering it a bit, I think the way to do it
> would not be field by field, but for the whole record.  <description> is
> already multiply occurring, so if we just add a lang attribute to
> <olac-archive>, then people could generate as many archive descriptions as
> they want in as many languages as they want.  That would make the
> olac-archive correspond to a table in a service provider's database that
> would be in a one-to-many relationship with the archives table, which would
> be a whole lot easier than handling one-to-many at the field level.
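
To make the one-to-many layout concrete, here is a minimal sketch of how a
service provider might store one row per archive and one row per language
version of its description. The table and column names are hypothetical
and are not defined anywhere in OLAC; SQLite is used only so the example is
self-contained.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative only,
# not part of OLAC or the OLAC-PMH.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE archives (
        archive_id      INTEGER PRIMARY KEY,
        institution_url TEXT NOT NULL
    );
    -- One row per (archive, language): the "many" side of the relationship.
    CREATE TABLE archive_descriptions (
        archive_id  INTEGER NOT NULL REFERENCES archives(archive_id),
        lang        TEXT NOT NULL,
        institution TEXT,
        curator     TEXT,
        PRIMARY KEY (archive_id, lang)
    );
""")

conn.execute("INSERT INTO archives VALUES (1, 'http://www.archives.ca/')")
conn.executemany(
    "INSERT INTO archive_descriptions VALUES (?, ?, ?, ?)",
    [
        (1, "en", "National Archives of Canada", "Mr. Ian E. Wilson"),
        (1, "fr", "Archives nationales du Canada", "M. Ian E. Wilson"),
    ],
)


def institution_name(conn, archive_id, lang):
    """Return the institution name for one archive in one language, or None."""
    row = conn.execute(
        "SELECT institution FROM archive_descriptions "
        "WHERE archive_id = ? AND lang = ?",
        (archive_id, lang),
    ).fetchone()
    return row[0] if row else None
```

Keying the description table on (archive_id, lang) means each archive can
carry at most one description per language, which matches the
record-level lang attribute rather than a per-field one.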

I agree that the lang attribute should be specified at the level of the
<olac-archive> record.  Here is a mock-up for the National Archives of
Canada / Archives nationales du Canada.  Lines that differ between the two
versions are marked with a leading asterisk, which is not part of the XML.

  <description>
    <olac-archive
        xmlns="http://www.language-archives.org/OLAC/0.4/olac-archive"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://www.language-archives.org/OLAC/0.4/olac-archive
                      http://www.language-archives.org/OLAC/0.4/olac-archive.xsd"
*       lang="en"
        type="institutional">
*      <archiveURL>http://www.archives.ca/02/0201_e.html</archiveURL>
*      <curator>Mr. Ian E. Wilson</curator>
*      <curatorTitle>National Archivist of Canada</curatorTitle>
*      <institution>National Archives of Canada</institution>
       <institutionURL>http://www.archives.ca/</institutionURL>
*      <location>395 Wellington Street, Ottawa, Ontario K1A 0N3, CANADA</location>
       <synopsis></synopsis>
       <access></access>
    </olac-archive>
    <olac-archive
        xmlns="http://www.language-archives.org/OLAC/0.4/olac-archive"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="http://www.language-archives.org/OLAC/0.4/olac-archive
                      http://www.language-archives.org/OLAC/0.4/olac-archive.xsd"
*       lang="fr"
        type="institutional">
*      <archiveURL>http://www.archives.ca/02/0201_f.html</archiveURL>
*      <curator>M. Ian E. Wilson</curator>
*      <curatorTitle>Archiviste national du Canada</curatorTitle>
*      <institution>Archives nationales du Canada</institution>
       <institutionURL>http://www.archives.ca/</institutionURL>
*      <location>395, rue Wellington, OTTAWA (Ontario) K1A 0N3, CANADA</location>
       <synopsis></synopsis>
       <access></access>
    </olac-archive>
  </description>
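
As a rough illustration of how a service provider could consume such a
description, the sketch below parses the <olac-archive> records, indexes
them by their lang attribute, and picks one for display. The function
names, the trimmed sample, and the fallback to "en" are my assumptions for
illustration; nothing in the proposal mandates a fallback language.

```python
import xml.etree.ElementTree as ET

# Namespace URI taken from the mock-up above.
NS = "http://www.language-archives.org/OLAC/0.4/olac-archive"

# Trimmed copy of the mock-up (asterisks and schema attributes omitted).
SAMPLE = """<description>
  <olac-archive xmlns="http://www.language-archives.org/OLAC/0.4/olac-archive"
      lang="en" type="institutional">
    <institution>National Archives of Canada</institution>
    <institutionURL>http://www.archives.ca/</institutionURL>
  </olac-archive>
  <olac-archive xmlns="http://www.language-archives.org/OLAC/0.4/olac-archive"
      lang="fr" type="institutional">
    <institution>Archives nationales du Canada</institution>
    <institutionURL>http://www.archives.ca/</institutionURL>
  </olac-archive>
</description>"""


def archive_records(xml_text):
    """Map each lang value to a dict of element name -> element text."""
    root = ET.fromstring(xml_text)
    records = {}
    for rec in root.iter(f"{{{NS}}}olac-archive"):
        # Strip the "{namespace}" prefix ElementTree puts on each tag.
        fields = {child.tag.rsplit("}", 1)[-1]: (child.text or "") for child in rec}
        records[rec.get("lang")] = fields
    return records


def pick(records, preferred, fallback="en"):
    """Return the record in the preferred language, else the fallback one.

    The fallback language is an assumption of this sketch.
    """
    if preferred in records:
        return records[preferred]
    return records.get(fallback)


records = archive_records(SAMPLE)
print(pick(records, "fr")["institution"])
```

Because lang sits on the record rather than on each field, the consumer
makes one choice per archive instead of one choice per element.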

As a best practice, each language version of the record should provide
semantically equivalent information.

Please let us know if anyone sees a problem with this simple approach to
supporting multiple languages for collection-level metadata.

Please post any responses directly to the list (simply by replying to the
email).

Steven Bird

--
Steven.Bird at ldc.upenn.edu  http://www.ldc.upenn.edu/sb
Assoc Director, LDC; Adj Assoc Prof, CIS & Linguistics
Linguistic Data Consortium, University of Pennsylvania
3615 Market St, Suite 200, Philadelphia, PA 19104-2608


