[Lexicog] RE: Query re keyboard re-mapping

Peter Kirk peterkirk at QAYA.ORG
Fri Sep 17 22:39:53 UTC 2004


On 17/09/2004 20:27, Benjamin J Barrett wrote:

> This is fairly off-topic, but it seems I'm not the only one on this
> list frustrated with this.
>
> I decided that Rudy's forwarded solution is probably the only recourse
> for getting superscripts. What would be nice is to have a keyboard
> ready to go, though. With this work-around, the keyboard only "works"
> in Microsoft Word.
>
> I suppose that's only natural, though, since superscripting is a
> formatting issue. I went to the Unicode page and they have a FAQ
> telling people not to request modified letters. They have a variety of
> diacritics that have a built-in backspace so they appear above the
> letter typed after them.


Well, the question is whether this is really a formatting issue. It is
certainly sometimes more than that. See below.

>
> That's as far as I have pursued this issue, but this means that it's
> impossible to type in Makah in a Weblog or any other of a variety of
> interfaces because it requires a superscript "w". Yes, you can add
> HTML tags in an HTML-compliant interface, so I guess you can program a
> keyboard to handle that, but then two "w" keys are needed, one for
> each interface, but that really just avoids the issue that Unicode
> doesn't directly support language X that requires superscripts.
>
> My thought is to approach the Unicode committee with the problem, but
> I would like to present the issue in a way that is informed with their
> issues and that makes the case clear.
>
> To me, the issue seems clear: If Unicode does not have a superscripted
> "w" as a character, there is no way for me to write Makah in e-mail,
> Excel, Yahoo!Chat or most of the other software. If Unicode does have
> it, users need only get an updated version of the Unicode font for
> their system to be able to display it correctly. The downside of this
> is perhaps a need to greatly expand the number of Unicode characters
> to support all the languages with such issues, but given the myriads
> of Chinese characters already included, that doesn't seem like THAT
> big of a deal.


The problem with what you write here is that Unicode *does* have a
superscripted w character, U+02B7 (MODIFIER LETTER SMALL W), and indeed
most other superscripted letters. These characters are rather scattered,
though. Some of them are in the Spacing Modifier Letters block,
http://www.unicode.org/charts/PDF/U02B0.pdf. More are in the Phonetic
Extensions block, http://www.unicode.org/charts/PDF/U1D00.pdf. A few,
mostly numerals, are in the Superscripts and Subscripts block,
http://www.unicode.org/charts/PDF/U2070.pdf. And some are in the Latin-1
Supplement, http://www.unicode.org/charts/PDF/U0080.pdf. I think between
these different places there is a full superscript basic Latin alphabet
plus several special forms and some Greek letters. See also
http://www.unicode.org/versions/Unicode4.0.0/ch14.pdf, p.12 of the file
(the page numbered p.359).
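
To check for yourself which block a given superscript lives in, something
like the following Python sketch may help. It uses only the standard
unicodedata module. The four sample characters (one per block named above)
and the helper name describe are my own choices for illustration, not an
official list.

```python
import unicodedata

# One illustrative character from each block mentioned above
# (the selection is an assumption made for this sketch).
SAMPLES = {
    "Spacing Modifier Letters": "\u02b7",
    "Phonetic Extensions": "\u1d43",
    "Superscripts and Subscripts": "\u2074",
    "Latin-1 Supplement": "\u00b2",
}


def describe(ch):
    """Return the code point and the official Unicode character name."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


for block, ch in SAMPLES.items():
    print(f"{block}: {describe(ch)}")
```

The names come from whatever version of the Unicode database ships with
your Python, so a very old installation may not know the newer characters.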

If there are any other superscripts which are needed to make a semantic
distinction (rather than just formatting) in any language and script,
and you can prove this, Unicode will add them.

Of course this doesn't solve the keyboarding issue. But Tavultesoft
Keyman and MS Keyboard Layout Creator will both allow you to assign keys
to any of these characters. And you need a font, of course, but there
are a number available which support these superscripts. For example,
Gentium (a free download from SIL) supports superscript w, but not all
of the superscripts. Code2000 should support all of them.

Note that the result is different from what happens with Ctrl-Shift-+ in
Word. That key sequence gives you a superscripted (i.e. reduced size and
raised) version of the same character, which is formatting only. A proper
Unicode keyboard gives you the proper superscript character, which is
what you should really have where there is a semantic distinction,
because the distinction is then maintained in plain text.
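
The difference can be sketched in Python. This is only an illustration:
the sequence "k" + U+02B7 is a made-up sample rather than a quoted Makah
word, and strip_tags and nfkc are hypothetical helper names. Stripping the
markup stands in for what happens when formatted text is pasted into a
plain-text interface.

```python
import re
import unicodedata

# Formatting approach: an ordinary "w" wrapped in markup.
marked_up = "k<sup>w</sup>"
# Character approach: U+02B7 MODIFIER LETTER SMALL W in plain text.
plain = "k\u02b7"


def strip_tags(text):
    """Crudely drop HTML tags, as a plain-text interface effectively does."""
    return re.sub(r"<[^>]+>", "", text)


def nfkc(text):
    """Apply Unicode compatibility normalization (NFKC)."""
    return unicodedata.normalize("NFKC", text)


# With the markup gone, the superscript distinction is lost.
print(strip_tags(marked_up) == "kw")
# The dedicated character survives a plain-text round trip unchanged.
print(plain.encode("utf-8").decode("utf-8") == plain)
# Caveat: compatibility normalization folds U+02B7 back to a plain "w".
print(nfkc(plain) == "kw")
```

One caution that follows from the last line: software which applies
compatibility normalization will lose the distinction again, whereas
canonical normalization (NFC) leaves U+02B7 alone.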

>
> The other solution is to provide a behind-the-scenes
> superscripting-and-letter-shrinking code that makes any letter smaller
> in size and superscripted. I don't think that is practical, though,
> due to the way computers work.
>
> With Longhorn (Windows 2006) well in the works, it might be too late,
> but it seems this is a critical issue for continuing the widening
> compatibility of Microsoft's software with various languages. Again, I
> would like to make a strong case to Microsoft. Along with this is the
> need to change the two-letter language abbreviations used in HTML and
> Microsoft's language bars to a three-letter abbreviation such as the
> Ethnologue uses. With only two letters, we'll run out of possible
> languages after 676 languages.


I know there is work in progress on this. In fact I think there is
already an international standard in progress for three-letter language
identifiers or similar. But I don't know what will be in Longhorn.
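
For a sense of scale, here is a small Python sketch of how many codes each
scheme allows. It assumes that codes are drawn from the 26 basic Latin
letters only and that every combination is usable, which real standards
may not permit; code_space is a hypothetical helper name.

```python
import string

# Assumption: codes use only the 26 lowercase basic Latin letters.
LETTERS = string.ascii_lowercase


def code_space(length, alphabet=LETTERS):
    """Count the distinct codes of a given length over the alphabet."""
    return len(alphabet) ** length


print(code_space(2))  # 26 * 26 = 676 two-letter codes
print(code_space(3))  # 26 * 26 * 26 = 17576 three-letter codes
```

So moving from two letters to three multiplies the available codes by 26.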

>
> Does anyone have experience in approaching these bodies?


There are people within SIL (of which I used to be a member) working on
these language identifiers. I can put you in touch if you like. I have my
own experience of working with the Unicode Consortium.

>
> Benjamin Barrett
> Baking the World a Better Place (with the famous dog Pasco)
> www.hiroki.us <http://www.hiroki.us>



--
Peter Kirk
peter at qaya.org (personal)
peterkirk at qaya.org (work)
http://www.qaya.org/



