News from the Open Language Archives Community (OLAC)

Steven Bird olac-admin at
Sat Sep 27 10:56:53 UTC 2003

Dear Community,

Here is a summary of the developments in the Open Language
Archives Community since our last general news posting in June.
Full details are available at:


At the last OLAC workshop in Philadelphia (December 2002), the OLAC
voting process was replaced with a council.  The OLAC Council is made
up of people who have experiential knowledge of OLAC and who make
decisions about OLAC standards, best practices and repositories as
described in the document process and registration process.  The OLAC
Council has now been formed, and has the following members: Anthony
Aristar (LINGUIST), Chris Cieri (LDC), Gary Holton (ANLC), Chu-Ren
Huang (Academia Sinica), Heidi Johnson (AILLA), Laurent Romary
(ATILF), Joan Spanne (SIL) and Martin Wynne (OTA).


OLAC standards concern the operation of OLAC's core infrastructure
(protocols and processes) and are mostly of concern to digital
archivists.  After an extended period of experimentation and review by
the community, the OLAC Process and OLAC Repositories standards have
now been adopted.  The process document summarizes the governing ideas
of OLAC and describes how OLAC is organized and how it operates,
including the document process and working group process.  The
repositories document defines the standards OLAC archives must follow
in implementing a metadata repository.  The metadata document is still
in "candidate" status, and will shortly undergo review on the
OLAC-Implementers mailing list.

OLAC Process (Adopted standard, 2003-07-08):
OLAC Repositories (Adopted standard, 2003-09-17):
OLAC Metadata (Candidate standard, 2003-05-31):

OLAC-Implementers mailing list:


Chu-Ren Huang (Academia Sinica) presented OLAC at the ENABLER/ELSNET
Workshop "International Roadmap for Language Resources", held in Paris
in August.  Details about the workshop, and Chu-Ren's presentation, are
available at:


The Rosetta ALL Language Archive project has been awarded close to
US$1M over two years from the NSF National Science Digital Library
Initiative.  The project is a collaboration between the Long Now
Foundation, the LINGUIST List, Stanford University, Eastern Michigan
University, the Open Language Archives Community, and the Endangered
Language Fund.  For more information please see the award abstract:


The following research publications concerning OLAC will appear in
2003.  All are available from the documents section of the OLAC website.

The Open Language Archives Community: An infrastructure for
  distributed archiving of language resources, Literary and Linguistic
  Computing 18(1), Special Issue on New Directions in Humanities
  Computing, 2003.

Building an Open Language Archives Community on the OAI foundation,
  Library Hi Tech 21(2), Special Issue on the Open Archives
  Initiative, 2003.

Extending Dublin Core Metadata to support the description and
  discovery of language resources, to appear in Computers and the
  Humanities 37(4), 2003.

Seven dimensions of portability for language documentation and
  description, Language 79, 557-82, 2003.

OLAC Documents page:


More language archives have implemented the OLAC 1.0 metadata format.
Here is the full list of current archives:

A Digital Archive of Research Papers in Computational Linguistics,
  Philadelphia, USA
Aboriginal Studies Electronic Data Archive
  Australian Institute of Aboriginal and Torres Strait
  Islander Studies, Australia
Academia Sinica Formosan Language Archive, Academia Sinica
  Taipei, Taiwan
Archive of the Indigenous Languages of Latin America, UT Austin,
  Austin, USA
ATILF Resources, Analyse et Traitement Informatique de la Langue Française,
  Nancy, France
Cornell Language Acquisition Laboratory, Cornell University,
  Ithaca, USA
Ethnologue: Languages of the World, SIL International,
  Dallas, USA
European Language Resources Association,
  Paris, France
Flint Archive, University of Queensland,
  Brisbane, Australia
LACITO Archive, Langues et Civilisations à Tradition Orale,
  Villejuif, France
Linguistic Data Consortium, University of Pennsylvania
  Philadelphia, USA
Natural Language Software Registry, German Foundation for Artificial Intelligence,
  Saarbrücken, Germany
Pacific And Regional Archive for Digital Sources in Endangered Cultures,
  Australia
Perseus Digital Library, Tufts University,
  Medford, USA
Rosetta Project 1000 Language Archive, Long Now Foundation,
  San Francisco, USA
Surrey Morphology Group Databases, University of Surrey,
  Guildford, UK
Survey for California and Other Indian Languages, UC Berkeley,
  Berkeley, USA
TalkBank, Carnegie Mellon University,
  Pittsburgh, USA
SIL Language and Culture Archives,
  Dallas, USA
Tibetan and Himalayan Digital Library, University of Virginia,
  Charlottesville, USA
TRACTOR Archive, Trans-European Language Resources Infrastructure,
  Oxford, UK
Typological Database Project, Utrecht University,
  Utrecht, Netherlands

The content of these archives can be searched using the OLAC interface
on the LINGUIST List site at:

(Note that some formerly registered archives are no longer on the list
as they do not conform to the OLAC 1.0 standard.  Administrators of
those archives should review the OLAC Metadata and OLAC Repositories
documents, update their repositories, and re-register with OLAC.)

Best wishes,
Steven & Gary
Steven Bird, University of Melbourne (sb at
Gary Simons, SIL International (gary_simons at
OLAC Coordinators (
