Reminder: Call for review of new metadata documents

Helen Aristar-Dry hdry at LINGUISTLIST.ORG
Wed Mar 26 13:12:18 UTC 2008


Hello, Gary (and all),

I just wanted to second both of Jeff's points.  I realize that I have 
always assumed that OLAC metadata is designed to facilitate resource 
discovery, not full resource description (which might be left to the 
IMDI metadata set, or another more elaborated standard).  I recall 
mentions of the fact that a researcher typically wants to find anything 
written on an endangered language, so just knowing the language code of 
some resources may be adequate.  And other discussions seem to have 
assumed that an archive will most likely not use OLAC metadata as its 
primary metadata set, but rather export a subset of its descriptive, 
technical, and administrative metadata in OLAC format.  It seems to me 
we routinely talk of OLAC as though its primary purpose is resource 
discovery.  This is a perfectly justifiable and reasonable mission, as 
Jeff notes below.  It has the advantage of (a) being doable and (b) 
filling a niche.  But I do think the mission statement should reflect 
it.  Such clarity would be helpful to those of us who routinely try to 
promote OLAC.

Even within the context of resource discovery, however, 'hasTranscript' 
would seem to be an important descriptor.  In a typical linguist's 
collection, where nine-tenths of the recordings have not been 
transcribed, 'hasTranscript' would distinguish those that another 
researcher would most want to find.  I can understand your not wanting 
to do a lot of work to produce a temporary solution, of course.  But 
this is something that has been frequently requested, so maybe OLAC 
could put it on some 'must-do' list.

And thank you for all the work you and Joan and Steven are doing.  I 
think the OLAC Users' Guide is a very helpful and well-conceived document.

All the best from snowy Michigan.
-Helen

Jeff Good wrote:
> Dear Gary,
> 
> Thanks for the clarification regarding the Relation element. It's too 
> bad we're stuck waiting for DC to finish its process. Would it make 
> sense for us to start the document process for this refinement before 
> they officially release the new schema for qualified Dublin Core? Then 
> we could take advantage of it quickly once it's official.
> 
> I have one other question about the new documents at this point, again 
> regarding granularity. The new discussion I think is quite welcome and 
> sufficiently detailed and clear to be put to use. I still find that a 
> bit of context is missing though. The background assumption seems to be 
> that OLAC metadata is intended for certain kinds of search (I don't know 
> of a good way to define those kinds of search other than to say they are 
> approximately Google-like). This certainly has been an assumption 
> driving OLAC for quite some time. The problem, as I see it, is that 
> nowhere in the OLAC documents (that I'm aware of) is this assumption 
> explicitly laid out.
> 
> Perhaps I'm the only one who reads these things, but the Mission 
> statement (pasted here) doesn't even explicitly talk about search at all:
> 
> "OLAC, the Open Language Archives Community, is an international 
> partnership of institutions and individuals who are creating a worldwide 
> virtual library of language resources by: (i) developing consensus on 
> best current practice for the digital archiving of language resources, 
> and (ii) developing a network of interoperating repositories and 
> services for housing and accessing such resources."
> 
> Since the current granularity recommendations are only indirectly 
> connected to the Mission, it would be nice if the relevant rationale for 
> them were given. In fact, to be honest, I'm not sure what the rationale 
> is precisely since I can imagine two fairly distinct ones: (i) that 
> OLAC's mission has changed and its primary focus is to serve as a bridge 
> between linguistic repositories and digital library initiatives like OAI 
> (an excellent mission, if more limited than the current one) or (ii) 
> that OLAC has determined that the most useful step it can make towards 
> its ultimate mission at present is to facilitate language resource 
> discovery in an OAI context.
> 
> While clarifying this issue is perhaps not all that important to move 
> forward with current work, it obviously could be pretty important down 
> the road, in particular as search technologies and our ideas about what 
> we want to search for and how we want to do it change.
> 
> Jeff
> 
> 
> 
> 
> On Mar 24, 2008, at 10:16 PM, Gary Simons wrote:
> 
>> Dear implementers,
>>
>> This is a reminder that we have one week left in the review period for
>> the documents listed in the attached message.  We are anxiously awaiting
>> your feedback!
>>
>> So far we have gotten just one comment, namely, from Jeff Good asking
>> about the possibility of using a solution like the following for
>> isTranscriptOf and hasTranscript:
>>
>> <dc:relation xsi:type="olac:lingrelations" olac:code="isTranscriptOf">
>>
>> Such a solution would be possible, but since isTranscriptOf is analogous
>> to isVersionOf (and the other refinements of dc:relation), it really
>> should be a new element (in the olac namespace) that is defined as a
>> refinement of dc:relation, which would also enable it to take the
>> encoding schemes that dc:relations take, e.g.
>>
>> <olac:isTranscriptOf xsi:type="dcterms:URI">
>>
>> This "proper" solution takes us beyond conformance to the current XML
>> schema for qualified Dublin Core, so our thinking is that we don't want
>> to implement a change like that, but rather wait for the revision of the
>> XML schema for qualified DC (due out this year) that will support such
>> extensions. We are also not keen to go to all the work of defining and
>> implementing the olac:lingrelations extension (which includes writing a
>> document and putting it through the stages of the review process) for a
>> short-lived temporary solution. Thus, we have these new refinements on
>> the list of changes for version 2.0 of our metadata format.
>>
>> -Gary
>>
>>
>>
>>
>>
>> From: Gary Simons <gary_simons at SIL.ORG>
>> Sent by: OLAC Implementers List
>>          <OLAC-IMPLEMENTERS at LISTSERV.LINGUISTLIST.ORG>
>> Date: 03/05/2008 10:35 PM
>> To: OLAC-IMPLEMENTERS at LISTSERV.LINGUISTLIST.ORG
>> Subject: Call for review of new metadata documents
>> Please respond to: Open Language Archives Community Implementers List
>>          <OLAC-IMPLEMENTERS at LISTSERV.LINGUISTLIST.ORG>
>>
>>
>>
>>
>>
>>
>> Dear implementers,
>>
>> Many of you also subscribe to the OLAC-GENERAL list and so have gotten
>> the general announcement about this call for review for new metadata
>> documents. Those of you who have implemented an OLAC data provider are
>> directly affected since this new work focuses on ways of improving the
>> quality of the metadata in our implementations.  In this message we
>> repeat the general announcement for the benefit of those not subscribed
>> to OLAC-GENERAL, and then we supply further information that is relevant
>> to you as implementers.
>>
>> Six months ago the US National Science Foundation awarded funding for a
>> project named "OLAC: Accessing the World's Language Resources" which aims
>> to greatly improve access to language resources for linguists and the
>> broader communities of interest. If you are interested in learning more
>> about the project, you may visit the project home page at:
>>
>>  http://olac.wiki.sourceforge.net/
>>
>> In the first phase of the project we are focusing on improving metadata
>> quality as a prerequisite to improving the quality of search.  To that
>> end we have drafted some new documents that can serve as a basis for
>> improving and measuring metadata quality within our community:
>>
>>  Best Practice Recommendations for Language Resource Description
>>  http://www.language-archives.org/REC/bpr.html
>>
>>  OLAC Metadata Usage Guidelines
>>  http://www.language-archives.org/NOTE/usage.html
>>
>>  OLAC Metadata Quality Metrics
>>  http://www.language-archives.org/NOTE/metrics.html
>>
>> These documents have been reviewed in Draft status by the Metadata
>> Working Group. After significant revision, they are now promoted to
>> Proposed status and are thus ready for review by the entire community.
>> In keeping with the OLAC Process standard, we hereby make a formal call
>> for review. The review period will end on MARCH 31, at which point all of
>> the comments that have been received will be processed to create revised
>> versions of the documents. You may submit comments by simply replying to
>> this message.
>>
>> <End of general announcement>
>>
>> The OLAC Metadata Standard that you followed in implementing your
>> repository defines the constraints on validity for a metadata record,
>> but it gives no advice about what a high quality metadata record is
>> like. The first two documents listed above address this issue.  Then, in
>> keeping with the OLAC core value of "Peer Review", we want to implement a
>> service that will measure conformance to the recommendations that can be
>> automatically tested for. That is the issue addressed by the third
>> document listed above.
>>
>> We have implemented the proposed Metadata Quality Score so that you can
>> see the implications for your current metadata. (As the documents are
>> revised to express community consensus, the implementation of the
>> metrics will be updated to match.) The metadata quality analysis as
>> currently implemented is accessible from a test version of the
>> Participating Archives page. The site has no links to this test page; it
>> is accessed by entering this URL in a browser:
>>
>>  http://www.language-archives.org/archives-new.php
>>
>> Follow the "Sample Record" link for your archive to see the quality score
>> for the sample record named in your Identify response, along with
>> comments on what can be done to improve the score. Follow the "Metrics"
>> link to see the average quality score for the records you are currently
>> providing. Kudos to the Audio Archive of Linguistic Fieldwork (Berkeley),
>> Centre de Ressources pour la Description de l'Oral (CRDO), and the
>> CHILDES Data Repository who are already getting scores around 8 or
>> higher.  The rest of us have room for significant improvement!
>>
>> Eventually, this new Participating Archives page will replace the one
>> that is currently accessed from the ARCHIVES link in the OLAC site
>> banner. However, this will not happen right away. After the current round
>> of review and any subsequent revisions, the documents will be put to the
>> OLAC Council, who will check the revised documents and promote them to
>> Candidate status when they feel they are ready. Next we will issue a call
>> for implementation and give at least one month for implementer feedback.
>> Based on that feedback, final revisions will be made to the satisfaction
>> of the Council who will then grant Adopted status.  The new Participating
>> Archives page will not replace the current one until the new guidelines
>> and metrics are adopted.
>>
>> This discussion of process is to let you know that you will probably want
>> to plan to update the implementation of your metadata repository some
>> time within the next few months. When these new metadata recommendations
>> and usage guidelines are officially adopted, the public will be able to
>> see the metrics scores for your repository. In the meantime, it is just
>> other implementers who are seeing them. You need not wait until the
>> Candidate call for implementation to begin implementing changes.  As soon
>> as your updated repository is harvested, you will see the metrics change.
>>
>> Again, the review period will end on MARCH 31, at which point all of the
>> comments that have been received will be processed to create revised
>> versions of the documents. You may submit comments by replying to the
>> list (and potentially entering into discussion with other implementers)
>> or by mailing them to <olac_project at gial.edu>. That account is handled
>> by Debbie Chang, a Masters candidate at the Graduate Institute of Applied
>> Linguistics who is the Research Assistant for our project.  She will
>> compile a list of all the comments (whether submitted to the list or to
>> the project account), which the document editors will then be asked to
>> respond to. That response will come after the review period closes.
>>
>> With a solid foundation based on quality metadata, our grant project will
>> be able to build improved search services and to expand coverage by
>> attracting more participating archives and by implementing gateways to
>> other aggregators.  We are grateful for your participation in this
>> venture and trust that you share our excitement about its potential.
>>
>> Best wishes,
>> Gary & Steven
>>
>> _______
>> Steven Bird, University of Melbourne and University of Pennsylvania
>> Gary Simons, SIL International and GIAL
>> OLAC Coordinators (www.language-archives.org)
>>
>>

-- 
Helen Aristar-Dry
Professor of Linguistics
Director, Institute for Language Information and Technology (ILIT)
Eastern Michigan University
2000 Huron River Rd., Suite 104
Ypsilanti, MI 48197

734.487.0144 (ILIT office)
734.487.7952 (faculty office)
734.482.0132 (fax)
hdry at linguistlist.org


