proposed revision of format.sourcecode

Steven Bird sb at UNAGI.CIS.UPENN.EDU
Mon Sep 23 06:38:54 UTC 2002


Baden Hughes <baden at compuling.net> wrote:
> After a survey of several language archives, I'd like to propose some
> possible changes to the format.sourcecode schema. Essentially this list
> is a list of programming languages of various types, in which software
> may be written. This list includes those found at:
> http://www.hypernews.org/HyperNews/get/computing/lang-list.html
>
> A draft can be found online at:
> http://www.compuling.net/projects/olac/220902-draft-olac-format.sourcecode.xsd
>
> Comments welcome.

This is great - a 20-fold increase on the number listed in my original 0.4
list.  I grepped for a few obscure languages and they were all there.

I'd like to raise two low-level technical issues, capitalization and
whitespace.

First, 99% of the codes are all-caps, even though some programming language
names are not written like this (e.g. the list gives "PROLOG" but it should
really be "Prolog").  However, rather than having to settle disputes about
this question, I'd prefer it if we case-normalized everything.  What do
people think - should we standardize on uppercase?

Second, Baden's list includes many items with spaces, e.g. "OBJECTIVE
CAML".  However, it seems desirable to limit the range of characters that
can appear in a controlled vocabulary item (e.g. no accents) so that there
are no transmission problems etc.  In some contexts, such as hand-crafted
CGI GET requests and HTML anchors, it is a pain to have to manually escape
the space character.  Could we live with a restriction of no spaces -
i.e. replacing spaces with underscores?
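
To make the two proposals concrete, here is a minimal sketch in Python of
what the normalization might look like, assuming we settle on uppercase
and underscores.  The function name normalize_code is hypothetical and is
not part of any OLAC schema or tool.

```python
import re
from urllib.parse import quote


def normalize_code(name: str) -> str:
    """Return a case-normalized, space-free vocabulary code (hypothetical helper)."""
    # Trim the ends, collapse each run of whitespace (including a line
    # break inside a wrapped name) into one underscore, then uppercase.
    return re.sub(r"\s+", "_", name.strip()).upper()


if __name__ == "__main__":
    for name in ["Prolog", "Objective Caml"]:
        code = normalize_code(name)
        # quote() percent-encodes the space in the raw name; the
        # normalized code needs no escaping at all.
        print(name, "->", code, "| raw in URL:", quote(name))
```

With this in place, "Objective Caml" becomes OBJECTIVE_CAML and can be
pasted into a query string or anchor unchanged, whereas the raw name would
have to be written with %20 in place of the space.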

** Note that neither of these issues is substantive, since each controlled
vocabulary item will be associated with a human readable form (including
translations into other languages).  For example, in Dublin Core, there is
a refinement named "hasVersion" with the human-readable label "Has
Version".  [http://www.dublincore.org/documents/dcmes-qualifiers/].
The plan is to do the same thing for OLAC vocabularies.
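
As a rough illustration of the code/label split, here is a sketch in
Python.  The table layout, the LABELS name, and the display_label function
are all assumptions made for the example; only the "hasVersion" / "Has
Version" pair comes from Dublin Core, and the two language entries simply
reuse names mentioned above.

```python
# Hypothetical label table: vocabulary code -> {language tag -> label}.
# Only English labels are filled in here; translations would be added
# as further language tags under each code.
LABELS = {
    "hasVersion": {"en": "Has Version"},
    "PROLOG": {"en": "Prolog"},
    "OBJECTIVE_CAML": {"en": "Objective Caml"},
}


def display_label(code: str, lang: str = "en") -> str:
    """Look up the human-readable label for a code (hypothetical helper)."""
    labels = LABELS.get(code, {})
    # Prefer the requested language, fall back to English, and as a
    # last resort show the raw code itself.
    return labels.get(lang) or labels.get("en") or code


if __name__ == "__main__":
    print(display_label("PROLOG"))
    print(display_label("OBJECTIVE_CAML", lang="fr"))
```

The point of the split is that the stored code can be as restricted as we
like, since users only ever see the label.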

-Steven


