updated OLAC Archive Report Cards

Gary Simons Gary_Simons at SIL.ORG
Wed Jun 30 02:25:08 UTC 2004


Baden,

Thanks for all your work on this report card system.  I think it is a great
idea and trust that it will serve to help all of us improve the quality of
the metadata we are publishing.

I'm the implementer for two archives, one of which ends up with a five-star
rating, while the other ends up with just two.  My intuition is that they
are not that different in quality, so I'm trying to understand the scoring
system to see what accounts for the huge difference. I've also reviewed our
archive report card with Joan Spanne, our archivist, to see what feedback
she might have.  She is actually responsible for many of the observations
in this note.

I'm looking at the documentation page and find that the explanation in "2.
Star Rating" doesn't give enough information to make it clear.  That says the
rating is based on the "average item score out of 10".  I'm not sure what
that means.
A natural conclusion would be that each of the remaining outline points in
the document is an item, and the overall rating is based on the average of
those.  But I don't think that is what it means, since those don't describe
things scored on the basis of 0 to 10.

10-point scoring seems to appear only in "4. Metadata Quality", and that
section does talk about "items", so is it the case that the star rating
deals only with point 4 on metadata quality?  If so, the discussion in
point 2 should make this explicit.

If I'm on the right track that the items that are averaged are just those
in point 4, I'm still not completely clear on what constitutes an "item".
... Okay, as I look back and forth some more, I'm developing a new
hypothesis, namely, that "item" refers to the record returned for one
archive holding.  (Up to this point I was thinking it meant, "one of the
things on our checklist of quality points to check for".)  So, does that
mean, each harvested record from the archive is given a quality score from
0 to 10, and the stars are based on the average for all records?  That is
starting to sound more likely.

In that case, it still seems like the stars are based only on "4. Metadata
quality".  Now I think I'm understanding what the quality metric is, but I
want to make sure.  The first part of it is:

Code exists score = ( Number of elements containing code attributes )
                    / ( Number of elements in the record of type with associated code )

Does this mean:

Code exists score = ( Number of elements containing a code attribute )
                    / ( Number of elements in the record that could have a code attribute )
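
If that reading is right, a minimal sketch of the calculation might look
like the following (Python; the set of "codable" element names is my own
assumption for illustration, not the official list):

    # Sketch of my reading of the "code exists" score.
    # CODABLE is a hypothetical set of element names that can take a code
    # attribute; the real list would come from the OLAC extensions.
    CODABLE = {"subject", "language", "type", "contributor"}

    def code_exists_score(record):
        """record: list of (element_name, has_code_attribute) pairs."""
        codable = [name for name, _ in record if name in CODABLE]
        coded = [name for name, has_code in record
                 if name in CODABLE and has_code]
        return len(coded) / len(codable) if codable else 0.0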

If so, then we could explore some actual cases and ask if we are getting a
reasonable answer.  For instance, if there were a record that contained
only one element and that was a <subject> element with a code from
OLAC-Language, would that mean a "code exists score" of 1.0?  It would be
missing 4 out of 5 core elements for a deduction of 10 * (1/5) * (0.8) = 1.6,
yielding a total score of 8.4.  If the archive contained thousands of
records, all of which had only a coded subject element, then the average
item score would be 8.4 for an overall rating of four stars.  Have I
understood the formulas correctly?  If so, then I think we will need to do
more work on figuring out the best way to compute the overall score.  In
this case a score that multiplies the percentage of core elements by the
percentage of code exists would yield 2 out of 10 which sounds like a more
appropriate score.

A fine point on "code exists" is what it does with non-OLAC refinements.
For instance, if a record contained only two elements and they were <type>
elements, one with the OLAC Linguistic Type and the other with the DCMI
Type, would that score as 1.0 or 0.5 on the code exists scale?  It looks to
me like it would be 0.5, which is half as good as the score of a record
consisting of only one coded <type> element, but in fact, the record with
two <type> elements is a better quality record.
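
Reusing the sketch above, and again assuming <type> counts as a codable
element, the two cases would come out like this under the current formula:

    # Record with two <type> elements, only one carrying an OLAC code:
    print(code_exists_score([("type", True), ("type", False)]))   # 0.5
    # Record with a single coded <type> element:
    print(code_exists_score([("type", True)]))                    # 1.0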

The metric for "3. Archive Diversity" needs more thought.  It is defined
as:

Diversity = (Distinct code values / Number instances of element) * 100

The scores for diversity with respect to the Linguistic Type code
illustrate the problem well.  An archive containing only one record which
is coded for one linguistic type would score 100%, whereas an archive
containing 1,000 records, all of which have a type element for the same
code, would score 0.1%--but the one archive is not 1,000 times as diverse as
the other.  I'm wondering if the formula shouldn't be:

Diversity = (Distinct code values / Total codes in vocabulary) * 100

Then every archive that has at least one instance of all three possible
values of Linguistic Type (regardless of how many records it has) would be
maximally diverse with respect to linguistic type.  I think that sounds
more correct.
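
A quick sketch of the two formulas side by side makes the contrast
concrete.  The three Linguistic Type values are hard-coded here from
memory, so treat them as placeholders:

    # Current versus proposed diversity, illustrated with OLAC Linguistic Type.
    LINGUISTIC_TYPE = {"primary_text", "lexicon", "language_description"}

    def diversity_current(codes):
        return 100 * len(set(codes)) / len(codes) if codes else 0.0

    def diversity_proposed(codes, vocabulary=LINGUISTIC_TYPE):
        return 100 * len(set(codes) & vocabulary) / len(vocabulary)

    one_record = ["lexicon"]
    thousand_records = ["lexicon"] * 1000
    print(diversity_current(one_record), diversity_current(thousand_records))    # 100.0 vs 0.1
    print(diversity_proposed(one_record), diversity_proposed(thousand_records))  # 33.3 vs 33.3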

Rather than Diversity, I wonder if the concepts of Breadth and Depth would
serve better.  That is, the Breadth of an archive (with respect to a
controlled vocabulary) would be the percentage of possible values that it
has.  Its Depth (with respect to that vocabulary) would be the average
number of records it has for each used code value.
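
In sketch form (both functions take the list of code values harvested for
one vocabulary across all of an archive's records):

    def breadth(codes, vocabulary):
        """Percentage of the vocabulary's possible values the archive uses."""
        return 100 * len(set(codes) & set(vocabulary)) / len(vocabulary)

    def depth(codes):
        """Average number of records per code value actually used."""
        used = set(codes)
        return len(codes) / len(used) if used else 0.0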

On "7. Code usage", elements that may take a code are in focus.  I think it
should be the code sets (i.e. the extensions) themselves.  We presently
define five extensions, but some can occur with more than one element.  I
think there are a total of 7 element-extension combinations.  I think it is
those that should be analyzed here, not just the elements.  For instance,
<subject> can occur with Language and Linguistic Field.  Those should be
calculated as two separate entries in the chart.
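
In other words, something along these lines, tallying usage per
(element, extension) pair rather than per element (the xsi:type strings
below are illustrative, not an exhaustive list):

    from collections import Counter

    def code_usage(records):
        """records: iterable of lists of (element_name, xsi_type_or_None) pairs."""
        usage = Counter()
        for record in records:
            for element, xsi_type in record:
                if xsi_type:   # e.g. "olac:language", "olac:linguistic-field"
                    usage[(element, xsi_type)] += 1
        return usage

    # <subject> with olac:language and <subject> with olac:linguistic-field
    # would then show up as two separate entries in the chart.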

That's all I have for now, but that is plenty to get the discussion
rolling.  Are you going to be at the EMELD conference?  If so, that might
be a great opportunity for some of us to gather at a whiteboard and thrash
out possible metrics.

Hope to see you there,
-Gary




                                                                       
Baden Hughes <badenh at CS.MU.OZ.AU>
Sent by: OLAC Implementers List <OLAC-IMPLEMENTERS at LISTSERV.LINGUISTLIST.ORG>
To: OLAC-IMPLEMENTERS at LISTSERV.LINGUISTLIST.ORG
Subject: updated OLAC Archive Report Cards
Date: 06/27/2004 07:14 PM
Please respond to: Open Language Archives Community Implementers List
                                                                       



Dear OLAC Implementers,

You may recall that in March, we announced a new service which had
recently been added to the OLAC site, namely archive report cards.  These
give summary statistics for each repository and an assessment of the
quality of the repository's metadata against both external best practice
recommendations and the relative use practices within the OLAC context.

An updated version of the archive report cards is now available - changes
include:

- updated the evaluation algorithm to account for changes in DC
  recommendations (e.g. use of contributor rather than creator)
- updated labelling of graphs to be more consistent with OLAC terminology

The report cards can be accessed by clicking the "REPORT CARD" links on
the OLAC Archives page [1].  The report is also available for the full set
of repositories [2].  Information about how these reports are generated
is also available [3].  Reports are updated after every harvest, which
currently runs every 12 hours.

The evaluation metric rewards the use of OLAC extensions
(controlled vocabularies), and what we
consider to be core DC elements: title, date, subject, description, and
identifier.

The service was developed by Amol Kamat, Baden Hughes, and Steven Bird
at the University of Melbourne, with sponsorship from the Department of
Computer Science and Software Engineering.  We welcome your feedback.

Regards

Baden Hughes

[1] http://www.language-archives.org/archives.php4
[2]
http://www.language-archives.org/tools/reports/archiveReportCard.php?archive=all

[3] http://www.language-archives.org/tools/reports/ExplainReport.html



