From sb at cs.mu.oz.au Fri Jun 13 02:51:09 2003
From: sb at cs.mu.oz.au (Steven Bird)
Date: Fri, 13 Jun 2003 12:51:09 +1000
Subject: News from the Open Language Archives Community (OLAC)
Message-ID:

Dear Community,

There have been many significant developments in the Open Language Archives Community since our last general news posting in July 2002. Here is a summary; full details are available on the OLAC website at http://www.language-archives.org/

OLAC WORKSHOP

Last December the IRCS Workshop on Open Language Archives was held at the University of Pennsylvania. This meeting revised core OLAC standards and established a new metadata format, version 1.0. Full details, including online proceedings, are available.

Workshop website: http://www.language-archives.org/events/olac02/

CURRENT OLAC ARCHIVES

Here is the full list of current archives:

A Digital Archive of Research Papers in Computational Linguistics, Philadelphia, USA
Academia Sinica Formosan Language Archive, Taipei, Taiwan
Australian Studies Electronic Data Archive, Canberra, Australia
Archive of the Indigenous Languages of Latin America, UT Austin, Austin, USA
ATILF Resources, Nancy, France
Cornell Language Acquisition Laboratory, Ithaca, New York
Ethnologue: Languages of the World, SIL International, Dallas, USA
European Language Resources Association, Paris, France
Rosetta Project 1000 Language Archive, Long Now Foundation, San Francisco, USA
Surrey Morphology Group Databases, University of Surrey, Guildford, UK
Survey for California and Other Indian Languages, UC Berkeley, San Francisco, USA
TalkBank, Carnegie Mellon University, Pittsburgh, USA
The Linguistic Data Consortium Corpus Catalog, Philadelphia, USA
The Natural Language Software Registry, Saarbrucken, Germany
The Typological Database Project, Utrecht, Netherlands
Tibetan and Himalayan Digital Library, University of Virginia, Charlottesville, USA
TRACTOR Archive, Oxford, UK
Flint Archive, University of Queensland, Brisbane, Australia

(Note that some formerly registered archives are no longer on the list because they do not conform to the OLAC 1.0 standard. Administrators of those archives should review the OLAC Metadata and OLAC Repositories documents, update their repositories, and re-register with OLAC.)

OLAC IN THE NEWS

Over the last year OLAC has featured in articles in Scientific American, Wired News, and the BBC World Service. Please see the news section of the website for pointers.

OLAC News page: http://www.language-archives.org/news.html

FEEDBACK SOUGHT ON OLAC DOCUMENTS

We invite comment from the wider language resources community on OLAC's proposed standards and recommendations. The standards concern the operation of OLAC's core infrastructure (protocols and processes) and are mostly of concern to digital archivists. The standards are discussed on the OLAC-Implementers mailing list. The recommendations, on the other hand, concern best practices in language resource description, and are mostly of concern to institutions and individuals who create and use language resources. The recommendations are discussed on the METADATA mailing list.

Proposed standards:

OLAC Metadata (2002-12-11): http://www.language-archives.org/OLAC/metadata.html
OLAC Process (2002-12-10): http://www.language-archives.org/OLAC/process.html
OLAC Repositories (2003-05-28): http://www.language-archives.org/OLAC/repositories.html
OLAC-Implementers mailing list: http://lists.linguistlist.org/archives/olac-implementers.html

Proposed recommendations:

Recommended metadata vocabularies for Discourse Types, Language Identification, Linguistic Field, Linguistic Data Types and Participant Roles: http://www.language-archives.org/REC/olac-extensions.html
METADATA mailing list: http://lists.linguistlist.org/archives/metadata.html

NEW OLAC INFRASTRUCTURE

Changes in OLAC standards, and also in underlying standards from the Open Archives Initiative and the Dublin Core Metadata Initiative, have required far-reaching changes in OLAC infrastructure. Over the last six months we have re-implemented all of the software infrastructure on the OLAC website. This work has been supported by the NSF EMELD and Talkbank projects.

As a consequence, it is now easier than ever to set up an institutional or individual metadata repository (i.e. resource catalog) and register it with OLAC. The simplest method is to create an XML file describing language resources, post it on a website, and register it with OLAC. Such catalogs are checked twice daily by the OLAC harvester, and any changes are incorporated into the central resource catalog maintained on the OLAC site. This is then made available to other services including the LINGUIST site.

EMELD Project: Electronic Metastructure for Endangered Languages Data
http://www.emeld.org/
Talkbank Project
http://www.talkbank.org/
OLAC Service Provider at LINGUIST
http://www.linguistlist.org/olac

RESEARCH PUBLICATIONS CONCERNING OLAC

The following research publications concerning OLAC will appear in 2003. All are available from the documents section of the OLAC website.

The Open Language Archives Community: An infrastructure for distributed archiving of language resources, Literary and Linguistic Computing 18(1), Special Issue on New Directions in Humanities Computing, 2003.

Building an Open Language Archives Community on the OAI foundation, Library Hi Tech 21(2), Special Issue on the Open Archives Initiative, 2003.

Extending Dublin Core Metadata to support the description and discovery of language resources, to appear in Computers and the Humanities 37, 2003.

Seven dimensions of portability for language documentation and description, to appear in Language 79, 2003.

OLAC Documents page: http://www.language-archives.org/documents.html

OLAC WORKING GROUP ON OUTREACH: CALL FOR PARTICIPATION

The OLAC Working Group on Outreach will raise awareness of the activities and resources of OLAC by facilitating the production of general-audience documents describing various aspects of OLAC and by contacting individuals and organizations who manage archives but are not yet part of OLAC. The group has the following working draft:

A Gentle Introduction to Metadata (Jeff Good)

The group is conducting its work on the OLAC-OUTREACH mailing list, which is hosted on the LINGUIST site. To learn more and to join the group, please see the Outreach Working Group page.

OLAC Outreach Working Group: http://www.language-archives.org/wg/outreach/

Best wishes,
Steven & Gary

________
Steven Bird, U Melbourne and U Pennsylvania (sb at ldc.upenn.edu)
Gary Simons, SIL International (gary_simons at sil.org)
OLAC Coordinators (www.language-archives.org)