From gary.holton at UAF.EDU Tue Apr 1 14:24:21 2008
From: gary.holton at UAF.EDU (Gary Holton)
Date: Tue, 1 Apr 2008 06:24:21 -0800
Subject: Last call for review of new metadata documents
Message-ID:

[re-posting ... my original posting bounced]

Dear Gary and others,

Sorry to weigh in at the eleventh hour. This is a very interesting discussion and has certainly helped to clarify several issues for me. I want to comment specifically on Jeff's proposal to delete the line "A metadata repository should not degrade the 'signal-to-noise ratio' for language resource discovery." That sentence clarifies a crucial issue. Too much detail impedes rather than aids search.

Part of the problem may be that from an archive point of view, the library catalog is not the appropriate model. While I won't propose deleting reference to library catalogs in the OLAC documents, I think it is important to understand this distinction. A library strives to catalog every physical item, so that even a 2-volume book will show up as two entries in the catalog, even though it is logically one "work". A traditional archive catalogs collections, which may contain multiple individual documents. The metadata for each collection is in the form of a finding aid, generally a prose document which describes the items in the collection. The collection is also cataloged physically for the purpose of locating items (e.g., shelf 12, box 2, folder 5), but descriptive metadata are not assigned at the box and folder level (though reference to specific folders may be made in the finding aid prose).

I think the essence of the granularity discussion is that we want OLAC metadata to be more archive-like than library-like. In a sense, this resolves the hasTranscript discussion as well, because the transcript will generally be a part of the same collection as the resource to which it is related.
In general, I find relations such as hasTranscript to be of limited usefulness because they rely on uniform coding by the cataloger, which is unlikely to happen. Any serious researcher wishing to look for transcripts will need to consult the entire collection rather than rely on a cataloger to identify those transcripts.

In order for the brave new world of linguistics (as envisioned, for example, in Gary's plenary at EMELD 2007) to proceed, we have to ensure that researchers point back to the original collection rather than just mine the bit their browsers point to. Promoting coarse granularity will help to achieve that.

Gary