50% Spanish or German, 50% Chinese

Eduard Selleslagh edsel at glo.be
Wed Jul 7 10:35:36 UTC 1999


[snip]

>At the risk of beating an undead horse, i would very much like it
>reiterated that not all of us have software capable of recognizing the
>means by which some lucky subscribers' software can encode diacritics
>and other expanded character sets.  And worse, some of us have software
>that recognizes such codes, but interprets them quite differently from
>the way they are intended.

[Ed Selleslagh]

I know of no software (e.g. Windows, text processors, browsers, e-mail
programs, etc.) that is incapable of handling the full 256 characters
(using 8 bits) of code page 437 or 850, unless you are using a 1970s IBM
mainframe.

[ Moderator's comment:
  Try a 1990's XKL mainframe.  One that uses 7-bit ASCII natively, and barfs
  on anything else.  Like the one running this mailing list.
  --rma ]

It is simply a matter of settings, which you can change easily. Unless... see
below.

>For instance, the software we have here at SCU automatically converts
>all such codes into representations of various Chinese characters.  The
>result is that i have recently received many postings including extended
>and richly-exemplified discussions of Spanish vocabulary, and most of
>the words in question have been printed half in Spanish and half in
>Chinese, to the point that i have been unable to make any sense out of
>the posting at all, and have regretfully decided i must automatically
>dump & ignore the whole discussion.

[Ed]

In English-speaking countries a number of servers, though far from all,
use only 7-bit encoding, i.e. only 128 characters.

In the case of 8-bit transmission, switching locally (on your own PC) to the
right character set (code page) will solve all the problems (e.g. in
Windows: 'international settings'). You are clearly using a different code
page, in which only the first 128 characters are recognized as classic ASCII.
BUT: languages like Chinese usually use 16-bit characters; I guess your PC is
set for a 16-bit (256 x 256 = 65,536 characters) code page, the first 128 of
which are reserved for the classic ASCII characters, as in all non-Latin
code pages (e.g. Greek, Cyrillic, etc.).
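
The effect described here can be reproduced in a few lines. This is only a
minimal sketch: it assumes the Spanish text was sent as ISO-8859-1 and that
the receiving machine reads it with a double-byte Chinese codec (GBK is used
here). Both choices are assumptions, since nothing in this thread says which
encodings were actually involved, and the function name is hypothetical.

```python
def misread_as_double_byte(text, sent_as="latin-1", read_as="gbk"):
    """Encode text with one 8-bit charset, then decode it with a double-byte one.

    Both codec names are illustrative assumptions, not taken from the thread.
    """
    raw = text.encode(sent_as)
    # errors="replace" keeps the sketch from raising on byte pairs that the
    # double-byte codec does not define.
    return raw.decode(read_as, errors="replace")


# A word with one accented letter, and a word that is pure ASCII.
mangled = misread_as_double_byte("año")
untouched = misread_as_double_byte("casa")

print(repr(mangled))
print(repr(untouched))
```

Bytes below 128 pass through unchanged, while a byte above 127 is treated as
the first half of a two-byte character and is combined with whatever follows
it. That would be consistent with words arriving "half in Spanish and half in
Chinese".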

Apparently, the *.xkl.com server correctly transmits 8-bit (256) codes,
which can be recognized correctly on your own PC using the right
settings, since I receive the diacritics correctly using code page 850 (437
works equally well).

[ Moderator's comment:
  No, the server here transmits only 7-bit ASCII.  MIME-quoted-printable
  messages, which use only 7-bit ASCII, are translated at the *receiving* end
  into 8-bit (which I cannot read with any mail program available to me, to
  check on this).

  Further, you are assuming a Windows system, with your discussion of "code
  pages" and the like.  Neither Macintosh nor Unix systems agree with Windows
  on 8-bit conventions.  Therefore, it is unfriendly to use 8-bit characters
  in mailing list messages.
  --rma ]
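
The disagreement between platforms can be made concrete by encoding one
accented letter under several 8-bit character sets. This is a sketch under
assumptions: the codecs listed (the two DOS code pages named in this thread,
plus ISO-8859-1, Windows-1252 and Mac Roman as stand-ins for typical Unix,
Windows and Macintosh conventions) are illustrative choices, and the function
name is hypothetical.

```python
def encode_everywhere(ch, codecs=("cp437", "cp850", "latin-1", "cp1252", "mac_roman")):
    """Return the byte that each 8-bit codec assigns to the character ch."""
    return {name: ch.encode(name) for name in codecs}


table = encode_everywhere("ñ")
for name, raw in table.items():
    # Print the codec name next to the hexadecimal value of its byte.
    print(f"{name:10s} {raw.hex()}")
```

The same letter ends up on three different byte values across these five
character sets, so a receiver that guesses the wrong set shows a different
character.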

Of course, there may be some people on this list who depend on a local
(somewhat old-fashioned) server that only transmits 7 bits, in which case
there is no solution to the problem.

[ Moderator's comment:
  Your moderator, for instance.
  --rma ]
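
The quoted-printable encoding mentioned in the moderator's earlier comment is
how 8-bit text can pass through a 7-bit-only server. A minimal sketch using
Python's standard `quopri` module follows; the ISO-8859-1 charset and the
helper name are assumptions made for illustration.

```python
import quopri


def to_quoted_printable(text, charset="latin-1"):
    """Encode text to bytes, then wrap any 8-bit bytes as =XX escapes."""
    return quopri.encodestring(text.encode(charset))


# What travels over the 7-bit link.
wire = to_quoted_printable("año")

# What the receiving end reconstructs, provided it knows the charset.
restored = quopri.decodestring(wire).decode("latin-1")

print(wire)
print(restored)
```

Only 7-bit ASCII crosses the wire. The receiving program still has to know
which character set the escaped bytes belong to, so this does not remove the
platform disagreement described above.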

It is not obvious that setting the right code page would eliminate your
'Chinese problem' because of the 16-bit setting, which creates a problem on
a deeper level than that among European-type PCs (actually their software).

>I've recently begun noticing similar problems with postings in German.

[Ed]

The solution is the same for all Western European languages (code page 437
or 850).

[ moderator snip ]

>[ Moderator's comment:
>  I have pointed this out in the past, and been roundly excoriated for my
>  point of view, taken somehow to be "English-only".  I will once again
>  suggest that we adopt a modified TeX-like accent-writing system, in which
>  the accent (in the typographical sense, which includes umlaut/diaeresis/
>  trema and the like) is written next to the character affected.  (In TeX
>  systems, it must precede, but I think that context can disambiguate for
>  human readers.)  Should I send out a list of the TeX conventions, for those
>  unused to them?
>  --rma ]

[Ed]

The list would be very welcome.
However, there is a problem: on most non-US keyboards, either the diacritics
are on dead keys (How do you write a diacritic without a letter under it? I
tried the dead key plus Spacebar: it works) or the letters with diacritics
have keys of their own (Maybe you have to use ALT+number codes?).

[ Moderator's response:
  The diacritics used in TeX are 7-bit ASCII characters.  I will post a list of
  them shortly.  Let's get caught up on the backlog first.
  --rma ]
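
Pending that list, the idea can be sketched as a small converter that writes
each accent as a 7-bit ASCII mark in front of its letter. The five marks below
are the standard TeX accent commands for grave, acute, circumflex, tilde and
umlaut/diaeresis. The function name is hypothetical, and the moderator's
"modified" convention may well differ from this.

```python
import unicodedata

# Combining mark -> TeX accent command (standard TeX, five common accents only).
TEX_ACCENTS = {
    "\u0300": "\\`",   # grave
    "\u0301": "\\'",   # acute
    "\u0302": "\\^",   # circumflex
    "\u0303": "\\~",   # tilde
    "\u0308": '\\"',   # umlaut / diaeresis / trema
}


def to_tex_ascii(text):
    """Rewrite accented letters as a TeX accent command followed by the letter."""
    out = []
    for ch in text:
        # NFD splits e.g. "ñ" into the base letter "n" plus a combining tilde.
        decomposed = unicodedata.normalize("NFD", ch)
        base, marks = decomposed[0], decomposed[1:]
        if len(marks) == 1 and marks in TEX_ACCENTS and base.isascii():
            out.append(TEX_ACCENTS[marks] + base)
        else:
            # Anything not covered by the table is passed through unchanged.
            out.append(ch)
    return "".join(out)


print(to_tex_ascii("año"))
print(to_tex_ascii("Müller"))
```

Characters outside the table (cedilla, ß and so on) are left as they are in
this sketch, so its output is not guaranteed to be pure 7-bit ASCII.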


