Corpora: Summary: Corpus metadata

Steven Bird sb at unagi.cis.upenn.edu
Mon Jun 24 16:01:41 UTC 2002


Mikko Lounela wrote:
> about two weeks ago I posted a query about corpus metadata. I also
> promised to post a summary. Thank you very much for the answers (total
> 8), and here is the summary.

Two of these messages mentioned OLAC, the Open Language Archives Community.
The Linguistic Data Consortium now documents all of its corpora using the
OLAC metadata set.  Other language resource institutions are involved,
including ATILF, DFKI, ELRA, LINGUIST, SIL, and more than a dozen others.

OLAC metadata has two main benefits: it is very easy to use, and the
infrastructure for indexing and search is already in place.  Please see
www.language-archives.org for full details.

Steven Bird

--
Steven.Bird at ldc.upenn.edu  http://www.ldc.upenn.edu/sb
Assoc Director, LDC; Adj Assoc Prof, CIS & Linguistics
Linguistic Data Consortium, University of Pennsylvania
3615 Market St, Suite 200, Philadelphia, PA 19104-2608


