For web page makers (fwd)

Koontz John E John.Koontz at colorado.edu
Mon Jun 19 16:11:03 UTC 2000


Here's something I sent aside in response to a query on using Unicode for
Web sites:

Unicode is missing precomposed combinations for things like vowel plus
nasal hook plus accent.  It only offers precomposed combinations with
diacritics for things that occur in (mostly) European(-originating)
languages.  Including those was part of the compromise that got the
Unicode people together with the European ISO committee.  Their original
scheme, on which linguists must still rely, is to provide sequences of
base characters and combining diacritics, with the diacritics following
the base character.  Unfortunately, most early implementations that I
have seen blithely ignore this, and the one I looked at that didn't
produced grossly inferior-looking results for combinations, sort of like
what WP (for DOS or early Windows?) used to do, with spidery lines for
the diacritics.
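
To make the precomposed-versus-sequence point concrete, here is a minimal
sketch using Python's standard unicodedata module (a present-day
illustration, not something from the original exchange).  It assumes the
combination in question is a + combining ogonek (U+0328) + combining acute
(U+0301), with the marks written after the base letter; the helper name
describe is made up for the example.

```python
import unicodedata

# Base letter followed by its combining marks (assumed example sequence):
# a + COMBINING OGONEK + COMBINING ACUTE ACCENT
A_OGONEK_ACUTE = "a\u0328\u0301"


def describe(text):
    """Return the Unicode character name of each code point in text."""
    return [unicodedata.name(ch) for ch in text]


# NFC composes wherever a precomposed character exists.
composed = unicodedata.normalize("NFC", A_OGONEK_ACUTE)
# NFD breaks everything back down into base + combining marks.
decomposed = unicodedata.normalize("NFD", composed)

print(describe(composed))
print(describe(decomposed))
```

Composition goes only as far as a-with-ogonek; the acute stays a separate
combining mark because there is no single precomposed code point for the
whole combination, so the renderer still has to stack it.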

----

The problem with a Unicode web site is the problem with any Web site that
uses other than the cross-section of standard Unix and Windows characters
that the Web standard recognizes.  People at sites without those
characters can't see anything.  For example, at a Unix site you can't see
some of the fairly innocent things (s-hacek?) that Wayne Leman uses in his
Cheyenne site, because those characters aren't available in the usual Unix
set (in the US).  What you see instead is a helpful blank.  (Incidentally,
WL is aware of this, but feels, reasonably, that most of his readers will
be using Windows systems.)

The problem is that though the poster of a Web page gets to see it as
intended, fully populated with local fonts, the receiver can only see it
that way if the receiver has all the same fonts.  Web pages are rendered
with the aid of browser-local fonts.

So, until all sites support Unicode, which means until all Unix and
Windows (and Mac, etc.) sites support Unicode in at least their Web
browsers, Unicode is not going to help at the receiving end.  Moreover,
for our purposes they have to support not just precomposed characters,
but composition of combinations by local rendering, so that when they see
a sequence a, ogonek, acute they render it as an accented a with a nasal
hook.
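
For what it's worth, one way to put such a sequence into a page without
depending on the page's own character encoding is to write it as numeric
character references.  The sketch below is only an illustration in
present-day Python; the function name to_ncr is hypothetical, and whether
a given browser then stacks the marks properly still depends entirely on
its local rendering and fonts.

```python
def to_ncr(text):
    """Replace every non-ASCII character with a decimal HTML character reference."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


# a + COMBINING OGONEK (U+0328) + COMBINING ACUTE ACCENT (U+0301)
print(to_ncr("a\u0328\u0301"))  # a&#808;&#769;
```

Plain ASCII text passes through unchanged, so only the special characters
get rewritten.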

---

There is an alternative, which I have been meaning to look at.  The road
to hell is paved with uncompleted projects.  This alternative is a scheme
put forward by Bitstream and Netscape to support downloadable fonts.
These fonts slow down the page, because they have to be downloaded, if not
present in the browser environment, but they do get downloaded and used if
they are missing.  They are secure, so they can't be used locally except
in conjunction with Web browsing.  They are supported natively by Netscape
browsers, and there is a plugin that gets automatically downloaded into MS
Internet Explorer that supports them there.  The one glitch I know of so
far is that making this plugin downloadable requires the support and
cooperation of the people maintaining the Web server (not just the Web
pages), at the distributing site.

The other glitch (of sorts) is that the tool that makes the fonts from
regular TrueType fonts costs c. $200.00.  It is possible to download a
trial version of it that will make one or two fonts.  It has been my
intention to test this out on the Standard Siouan fonts, but I haven't
gotten around to it.

Of course, if you are, say, using a PC browser on a system that has the
Standard Siouan fonts installed, and browsing pages that are coded in
these fonts (among others) you should see the pages in Standard Siouan
characters.  At least this is the theory.  Jan has discovered some problem
combinations, though I don't remember the specifics at the moment.

The advantage of this approach over the approach of representing each
character as a gif file, which is what Shannon has set up, is that the
html files use a single character to represent a character, instead of
using a graphics file download instruction.  Also, though I haven't tried
and don't know the details, I'd think that the downloading might be
faster.

Note:  MS also has a downloadable fonts scheme, but it doesn't work with
Netscape, and may not work with all versions of MSIE, e.g., the Unix one,
though the Mac version of MSIE is also supposed to be somewhat
impoverished, feature-wise (incipient case suffix in English!).


