[Meta-discussion] TeX-style diacritics

Rich Alderson ALDERSON at mathom.xkl.com
Thu Jul 15 02:47:14 UTC 1999


Dear Readers:

Recently, a reader noted that he was having trouble reading posts containing
8-bit (non-ASCII) characters in Spanish and German, because his mail system
interpreted each of them as the first byte of a 16-bit Chinese character.  As
an experiment in usability, I have re-coded the portions of some posts which
contained 8-bit characters to use TeX-style diacritics; I hope these have been
useful to readers who do not use Windows systems to read this list.

  (Aside:  I had to play games with a Windows system to create a cheat sheet
  for myself to do this.  As I have noted in the past, the Indo-European and
  Nostratic lists are maintained and processed on an XKL Toad-1 system running
  Tops-20, which is aggressively 7-bit-only, and it will be a long while before
  I can take the time to learn a different system to move the lists.  I'd like
  to spend that time on the lists themselves, of course.)

I have long advocated the use of TeX-style diacritics in e-mail and in Usenet
newsgroup postings, because both of these media handle 8-bit character
encodings badly on many systems; further, even 8-bit-friendly systems disagree
about where accented characters sit within the character set (Windows differs
from Macintosh, which differs from the various flavours of Unix, which differ
from Windows ...).
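
The disagreement is easy to demonstrate.  What follows is only a sketch in
Python; the codec names are my own stand-ins for "Windows", "Macintosh" and
"Unix" (real systems of each kind use many other code pages), and the names
CODECS and decode_everywhere are invented for the illustration.

```python
# Assumed stand-ins for the three families of system; not a complete list.
CODECS = {
    "Windows": "cp1252",
    "Macintosh": "mac_roman",
    "Unix": "latin_1",
}


def decode_everywhere(raw):
    """Decode the same bytes under each codec in CODECS."""
    return {system: raw.decode(codec) for system, codec in CODECS.items()}


# The single byte 0xE9 is e-acute under cp1252 and latin_1 but
# E-grave under mac_roman, so the 8-bit form is read differently.
eight_bit = decode_everywhere(b"caf\xe9")

# The TeX-style spelling is pure ASCII, so every codec reads it alike.
seven_bit = decode_everywhere(rb"caf\'{e}")
```

Here the Windows and Unix stand-ins happen to agree on this byte and the
Macintosh one does not; the 7-bit spelling comes out the same under all three.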

TeX is a typesetting system whose extensive macro facilities allow the creation
of very rich document formatting systems (LaTeX being the best known).  (I
personally think that all linguists should learn and use it, for its wonderful
capacity to use multiple character sets organized into easily-accessed fonts,
but this is not the forum for that discussion; I mention it only as background.)

In standard TeX, accents are created by the following sequences of characters:

	\'	acute accent
	\`	grave accent
	\^	"hat"-style circumflex accent
	\~	tilde or "swing"-style circumflex accent
	\"	umlaut/trema/diaeresis
	\=	macron/overline
	\.	superior dot
	\b	underline/bar-under
	\c	cedilla
	\d	inferior dot
	\u	breve accent
	\v	h{\'a}\v{c}ek
	\H	Hungarian long umlaut
	\t	superior tie
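
The kind of re-coding described earlier can be approximated mechanically from
this table.  The following Python sketch is an illustration only, not the
procedure actually used for the list; the pairing of each command with a
Unicode combining character, and the names TEX_ACCENT and to_tex, are my
assumptions.  It handles a plain Latin letter carrying a single accent and
leaves everything else (including \t, which spans two letters) untouched.

```python
import unicodedata

# Unicode combining mark -> TeX accent command (assumed correspondence).
TEX_ACCENT = {
    "\u0301": "'",   # acute
    "\u0300": "`",   # grave
    "\u0302": "^",   # circumflex
    "\u0303": "~",   # tilde
    "\u0308": '"',   # umlaut/diaeresis
    "\u0304": "=",   # macron
    "\u0307": ".",   # superior dot
    "\u0331": "b",   # bar-under
    "\u0327": "c",   # cedilla
    "\u0323": "d",   # inferior dot
    "\u0306": "u",   # breve
    "\u030C": "v",   # hacek
    "\u030B": "H",   # Hungarian long umlaut
}


def to_tex(text):
    """Rewrite singly-accented Latin letters as \\X{letter}."""
    out = []
    for ch in text:
        parts = unicodedata.normalize("NFD", ch)  # base letter + marks
        if (len(parts) == 2 and parts[1] in TEX_ACCENT
                and parts[0].isascii() and parts[0].isalpha()):
            out.append("\\" + TEX_ACCENT[parts[1]] + "{" + parts[0] + "}")
        else:
            out.append(ch)  # ASCII and unsupported characters pass through
    return "".join(out)
```

For instance, to_tex applied to the word for the hacek itself, written with
its real accents, gives h\'{a}\v{c}ek.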

I usually enclose accented letters in braces {} to organize them.  In TeX,
spaces after a command whose name is a letter (like \v) are ignored, and the
accent commands skip any space before the letter they apply to, so {\v c} is
one way to write <c-with-hac^ek> (to use another mode of writing certain
accents); this can be hard to read, even when one is a TeX user, so it is
preferable to use the alternative I used above, \v{c}.  Thus, <c-with-cedilla>
could be written \c{c}.
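
Readers whose systems can display accented letters may want to go the other
way.  This is a rough Python sketch under my own assumptions (the names
COMBINING and tex_to_unicode are invented, and the Unicode combining characters
are my choice of equivalents); it is not a TeX parser.  It recognizes the
spellings {\v c}, \v{c} and \'a on plain Latin letters and ignores \t.

```python
import re
import unicodedata

# TeX accent command -> Unicode combining mark (assumed correspondence).
COMBINING = {
    "'": "\u0301", "`": "\u0300", "^": "\u0302", "~": "\u0303",
    '"': "\u0308", "=": "\u0304", ".": "\u0307", "b": "\u0331",
    "c": "\u0327", "d": "\u0323", "u": "\u0306", "v": "\u030C",
    "H": "\u030B",
}

# A symbol accent, or a letter accent not followed by another letter
# (so that a longer command such as \vec is not mistaken for \v).
ACCENT = r"""(['`^~"=.]|[bcduvH](?![A-Za-z]))"""
LETTER = r"([A-Za-z])"

PATTERN = re.compile(
    r"\{\\" + ACCENT + r"\s*" + LETTER + r"\}"            # {\v c}
    + "|" + r"\\" + ACCENT + r"\s*\{" + LETTER + r"\}"    # \v{c}
    + "|" + r"\\" + ACCENT + r"\s*" + LETTER              # \'a or \v c
)


def _compose(match):
    # Exactly one alternative matched, so two groups are non-empty.
    accent, letter = [g for g in match.groups() if g is not None]
    return unicodedata.normalize("NFC", letter + COMBINING[accent])


def tex_to_unicode(text):
    """Replace TeX-style accented letters with accented Unicode letters."""
    return PATTERN.sub(_compose, text)
```

Text without any accent commands passes through unchanged, so the function can
be applied to a whole message.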

NB:  This is standard TeX; I would actually prefer to use \, for the cedilla
accent, although it has a different meaning in standard TeX.  Since anything
in TeX is ultimately user-definable, we can adopt that as an alternate to \c .

Finally, in TeX superscripts and subscripts are indicated by ^{super} and
_{sub} respectively; the braces are not needed when the script is a single
character.  Thus *k{^w}is vs. *ekwos is unambiguous as to meaning:  the former
has a labiovelar initial, the latter a medial cluster.  The laryngeals can be
written
as *H_1, *H_2, *H_3, or \'x, x, x{^w} (as was Cowgill's wont), or @_1, @_2, @_3
(using the "ASCII IPA" symbol <@> for shwa), or some other mode if you prefer.
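
These forms can also be turned into display characters where a system supports
them.  The Python sketch below is an assumption-laden illustration: the names
SUPERSCRIPT, SUBSCRIPT and render_scripts are invented, and the small set of
Unicode modifier letters and subscript digits is my own selection.  Anything it
does not recognize is left exactly as written.

```python
import re

# Only the scripts likely to occur in reconstructions; extend as needed.
SUPERSCRIPT = {"w": "\u02B7", "h": "\u02B0", "y": "\u02B8"}
SUBSCRIPT = {str(d): chr(0x2080 + d) for d in range(10)}

# Matches {^w}, ^{w} and ^w (likewise with _ for subscripts).
SCRIPT = re.compile(r"\{([\^_])(\w)\}|([\^_])\{(\w)\}|([\^_])(\w)")


def render_scripts(text):
    """Replace known single-character super/subscripts with Unicode."""
    def repl(match):
        kind, ch = [g for g in match.groups() if g is not None]
        table = SUPERSCRIPT if kind == "^" else SUBSCRIPT
        # Unknown scripts (e.g. the ^e in hac^ek) are returned unchanged.
        return table.get(ch, match.group(0))
    return SCRIPT.sub(repl, text)
```

Applied to *k{^w}is this yields the form with a raised w, while *ekwos and
hac^ek come back untouched.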

It was suggested by a TeX user that we use an abbreviated set of accents; in
many cases this is fine, but for full generality I would encourage readers to
fall back on the full TeX notation wherever the abbreviated set does not
suffice.

I will collect the responses to this message and summarize them, or create a
digest by hand, rather than filling the list with further non-Indo-European
topics.  Simple replies will keep the "[Meta-discussion]" subject header and
make it easy for me to do so.

								Rich Alderson

References for those who would like to know more about TeX:

Lamport, L.  _LaTeX:  A Document Preparation System_, 2nd edition (1994:
Addison-Wesley, ISBN 0-201-52983-1) describes the most commonly used variant
of TeX.

Snow, W.  _TeX for the Beginner_ (1992: Addison-Wesley, ISBN 0-201-54799-6) is
a hands-on tutorial which concentrates on "Plain TeX", with references to major
differences from LaTeX.

Kopka, H.  _LaTeX: Eine Einfuehrung_ (1994: Addison-Wesley) is a 3-volume work
on the basic LaTeX system and the large number of extension packages which have
been created for it.  I don't own a copy and can't get the ISBN.

Goossens, M., Mittelbach, F., and Samarin, A.  _The LaTeX Companion_ (1994:
Addison-Wesley, ISBN 0-201-54199-8) describes a large number of extensions to
the LaTeX system.

Knuth, D.  _The TeXbook_ (1986:  Addison-Wesley, ISBN 0-201-13447-0) is the
description of "plain TeX" by the gentleman who created it.  Difficult and
dense, but worth looking through for the erudite quotations at chapter ends,
and the subtle jokes throughout.
-------


