[Lexicog] RE: More on keyboard re-mapping and special characters (fwd)
Rudolph C Troike
rtroike at U.ARIZONA.EDU
Fri Sep 24 06:17:56 UTC 2004
I am forwarding here a lengthy and very helpful note from my friend at
Microsoft, Adam Dudsic (who works in the speech recognition area), which
addresses a number of the issues and concerns that have been raised
recently on the list. (In it he references some of the messages to the
list that I forwarded to him, to which he is responding, in addition to
responding to some needs I raised with him.) This gives some very useful
information and suggestions, and a specific avenue to contact Microsoft.
Rudy
---------- Forwarded message ----------
Date: Thu, 23 Sep 2004 13:05:05 -0700
From: Adam Dudsic <adudsic at microsoft.com>
To: Rudolph C Troike <rtroike at U.Arizona.EDU>
Subject: RE: More on keyboard re-mapping and special characters
Rudy,
I see several issues in the email you sent, and have tried to
satisfactorily address them.
ISSUE 1: Language codes. Benjamin points out a limitation with
two-letter language codes and asks why we don't use three-letter codes
in HTML and MS language bars.
HTML standards are determined by the W3C, so requests to change the HTML
specification for use of language codes would have to be submitted to
them. For reference, the specification for HTML 4.01 language codes is
here: http://www.w3.org/TR/html4/struct/dirlang.html#h-8.1.1.
With regard to the MS language bar, if I understand correctly, the
language IDs that are shown correspond to the Multilanguage User
Interface (MUI) packs that are installed on the system. As for how
those language codes are determined, this is another case where MS is
following the standards set out by an international body--the ISO. MS
guidelines direct us to use only ISO 639
(http://www.w3.org/WAI/ER/IG/ert/iso639.htm) and ISO 3166 language and
country IDs and, in combining these codes, to follow RFC 3066.
Although ISO specifies both three- and two-letter language codes,
RFC 3066, section 2.3 (Choice of language tag), rule 2 states:
"When a language has both an ISO 639-1 2-character code and an ISO 639-2
3-character code, you MUST use the tag derived from the ISO 639-1
2-character code." (http://rfc.net/rfc3066.html#s2.3)
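As a rough illustration of that rule, the sketch below picks the two-letter
code when one exists and falls back to the three-letter code otherwise. The
lookup table and the function name language_tag are assumptions made for this
example only; they are not part of RFC 3066 or of any Microsoft tool.

```python
# Hypothetical lookup table: language name -> (ISO 639-1 code or None, ISO 639-2 code).
ISO_639 = {
    "English": ("en", "eng"),
    "Navajo": ("nv", "nav"),
    "Cherokee": (None, "chr"),  # no two-letter code exists for Cherokee
}


def language_tag(language, country=None):
    """Build an RFC 3066 style tag, preferring the two-letter code."""
    two_letter, three_letter = ISO_639[language]
    # Rule 2 of section 2.3: use the two-letter code whenever there is one.
    primary = two_letter if two_letter is not None else three_letter
    # An optional ISO 3166 country code follows as a subtag after a hyphen.
    return f"{primary}-{country}" if country else primary


print(language_tag("English", "US"))   # en-US
print(language_tag("Navajo"))          # nv
print(language_tag("Cherokee", "US"))  # chr-US
```

A language with only a three-letter code, such as Cherokee here, still gets a
valid tag, which is why the limitation mainly affects software that assumes
every tag starts with exactly two letters.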
A contact I have with the internationalization teams says that they are
well aware of the limitation imposed by the two-letter language
identifier, and are investigating the best course of action.
ISSUE 2: Resources for "exotic" languages. I gather there has been quite
a bit of frustration over the lack of support for "exotic" languages.
I'm guessing that the market for "exotic" languages is probably not
large enough to justify a business decision to develop Multilanguage
User Interfaces. I'm sure you're aware of costs in research and
development of the software code behind such resources, but the costs go
far beyond this. The commitment required in creating any piece of
software (especially given expectations people have for MS software)
includes usability testing, legal research, marketing, translation...and
an unbelievable amount of testing: Testing of component function,
testing of *every* user interface surface (windows, labels, buttons),
input hardware compatibility (keyboards, mice...etc.), application
compatibility, operating system compatibility, and daily code changes.
One of the test leads in my group estimates that for the two service
components in our product that his team tests, they run around 3,000
tests a day per platform. The test team also has weekly and monthly
tests. And even once a product is "done," there are still maintenance
and update costs.
In the end, it's a business. But in Microsoft's defense, I have to say
that after I arrived here, I have seen MS involved in a lot of projects
simply for the sake of community service. Upper management has made it
clear that community service is of great value to MS.
*I can't say it is likely,* but neither is it inconceivable that if a
group of linguists were to specify a font glyph set and keyboard mapping
that would serve a wide range of Native American language needs, someone
in the internationalization group might take an interest in the project
(to try this, get a large group of linguists to contact "Dr.
International"--see ISSUE 3).
Could such a glyph set be designed?--one that covers a wide range of
"exotic" languages?
SUGGESTION: The more complete a case you make, the more likely it is
that it will be considered. To do this, I suggest you write a
specification that addresses:
1. Goals of the project (concise summary of what you want to accomplish,
e.g., create a Unicode font and keyboard mapping that enables users to
type and read glyphs for the following twenty Native American
languages....)
1b. Non-goals of the project (limits--we want to go this far, but no
further, e.g., the goal is not to create a full MUI)
2. Target Audience (who will use the product, e.g., linguists,
students...)
3. Key Business Drivers (here is where you can appeal to community
service, preservation of languages, improvement of education and
research...etc.)
4. Scenario(s) (illustrate how the product would typically be used,
e.g., a language learning lab scenario, a Native American elementary
school scenario, a university research scenario...). Scenarios can help
developers make decisions about how to write code.
5. Functional Requirements (e.g., installer installs font and keyboard
map; tables of which keys map to which glyphs; specification of which
operating system the product should work on...other hardware/software
requirements...)
6. Deliverables (What will actually be produced/distributed/published?)
7. Resources (What can you all do, and what do you want MS to do?)
8. Risks (scheduling/resource dependencies)
9. Issues (What problems with the project still have no clear solutions
or require additional investigation? Will there be IP issues?)
Even if MS decides not to take on the project, with such a specification
in hand, you could easily apply for a grant to hire a developer or two
to work on the project.
ISSUE 3: Who to talk to at Microsoft. For globalization, localization,
localizability, or international feature usage problems, MS has a
resource ("Dr. International") whose job it is to find answers to the
kinds of questions the listserv is raising (including Longhorn
globalization/localization-related questions). Go to:
http://www.microsoft.com/globaldev/drintl/askdrintl.aspx and use the
email form to enter your contact information and a description of your
problem/question. I would encourage any of the people on the listserv
who want to pursue this issue with MS to contact Dr. International. I
think this is the best and most direct way to get your concerns to the
right people and get accurate answers.
Dr. International actually has an official FAQ entry regarding
keyboard customization. It's the first question on the page at:
http://www.microsoft.com/globaldev/DrIntl/columns/003/default.mspx.
ISSUE 4: Makah keyboard. Patrick Chew seems to have the right solution
here. I'm glad he named the keyboard mapping tool--I've since found it
and downloaded it. As long as you have a font that contains all of the
glyphs you need, you can create a keyboard map based on the glyphs in
that font. The Lucida Sans Unicode font contains the superscripted "w".
If Lucida Sans Unicode also contains all of the other Makah glyphs,
Patrick could map out a Makah keyboard using that font.
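To make the idea of a keyboard map concrete, here is a minimal sketch of one
as a lookup table from key combinations to characters. The key assignments,
the "altgr" modifier name, and the function translate are all invented for
illustration; this is not an actual Makah layout and not the file format of
the keyboard layout tool.

```python
# Hypothetical layout fragment: (modifier, key) -> output character.
# These assignments are invented for illustration, not a real Makah layout.
LAYOUT = {
    ("altgr", "w"): "\u02b7",  # superscripted w (MODIFIER LETTER SMALL W)
    ("altgr", "q"): "\u0294",  # glottal stop, included as a second example
}


def translate(keystrokes, layout=LAYOUT):
    """Turn (modifier, key) pairs into text; unmapped keys pass through."""
    return "".join(layout.get((mod, key), key) for mod, key in keystrokes)


# Typing k, then AltGr+w, then a yields "k" + superscripted w + "a".
typed = [(None, "k"), ("altgr", "w"), (None, "a")]
print(translate(typed))
```

A table like this is also a convenient form for the "which keys map to which
glyphs" part of a specification, since it can be reviewed by linguists before
anyone builds the layout.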
ISSUE 5: Custom keyboard for your Native American Indian languages. As I
mentioned, I downloaded the keyboard layout tool that Patrick mentioned.
It looks pretty straightforward. If: 1) All of the glyphs you require
are contained within a single font (e.g., Lucida Sans Unicode, or
perhaps an SIL font); and 2) You were to send me a list of the glyphs
that you would need for your keyboard layout...then I think I could help
you create a keyboard layout. If requirement 1 is not met, then you
would need to find someone who has font creation software who could make
you a custom font containing all of the required glyphs. Fontlab's
TypeTool (http://www.pyrus.com/Font-tools/TypeTool/) for $99 retail or
$49.50 academic looks like it would do the job.
To examine the Lucida Sans Unicode font to determine whether it contains
all of the glyphs you need:
1. Go to "Start", select "All Programs", select "Accessories", select
"System Tools", and click "Character Map".
2. Select "Lucida Sans Unicode" in the "Font" list box.
3. Optionally, enable the "Advanced view" check box, and in the "Group
by" list box, select "Unicode subrange". This displays the glyphs in
groups that I think are easier to scan.
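For a longer glyph list, the same check can be sketched in a few lines of
Python. This is only an illustration: the function missing_glyphs and the
coverage set below are assumptions, and a real coverage set would have to be
read from the font file itself (for example with a font library), which this
sketch does not do.

```python
import unicodedata


def missing_glyphs(required, covered):
    """Return (code point, name) pairs for required characters not covered."""
    report = []
    for ch in required:
        if ord(ch) not in covered:
            name = unicodedata.name(ch, "<unnamed>")
            report.append((f"U+{ord(ch):04X}", name))
    return report


# Hypothetical coverage set standing in for the code points a font supports.
covered = {ord("a"), ord("w"), 0x02B7}
# Characters the keyboard layout would need; U+02B7 is the superscripted w.
required = ["a", "\u02b7", "\u0294"]

for code, name in missing_glyphs(required, covered):
    print(code, name)  # U+0294 LATIN LETTER GLOTTAL STOP
```

An empty result would mean every required character is in the set, which
corresponds to requirement 1 above being met for that font.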
Let me know if this looks like it would work for you.
Adam
PS Since I've included Microsoft information, I'm required to include
the standard MS disclaimer (I'd also appreciate it if you included this
with anything you forwarded):
**********
This information is provided "AS IS" with no warranties, and confers no
rights.
**********