[Corpora-List] Sorting upper-ASCII chars in Unix

Vlado Keselj vlado at cs.dal.ca
Mon Nov 24 20:27:41 UTC 2003


On Mon, 24 Nov 2003, William Fletcher wrote:

> A recent query elicited numerous responses from Unix gurus.  Perhaps
> one of them can help me with a question that has our Unix people
> stumped.
>
> I have been trying to use the Unix sort function to sort files which
> contain upper-ASCII characters (i.e. ASCII code > 127) on a machine with
> locale, language and charset set to US English.  Lower-ASCII characters
> and some upper-ASCII characters sort fine, but some upper-ASCII
> characters (specifically some non-alphanumeric ones) are left in
> semi-random order.
>
> How should the relevant environmental variables be set to permit sorting
> files in straight ASCII order?


This can cause a lot of frustration, indeed.

The following variables may affect sorting:
LANG, LANGUAGE, NLSPATH, LOCPATH, LC_ALL, LC_MESSAGES

I believe that setting LC_ALL=POSIX solves the problem, since sort then
compares raw byte values instead of applying the locale's collation rules.
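
A minimal sketch of how this might look in practice, assuming a POSIX
shell and a standard sort. The file names sample.txt and sorted.txt and
the sample lines are made up for illustration. LC_ALL overrides the other
LC_* settings, including LC_COLLATE, which is the category that governs
collation order. Setting it on the command line changes the locale for
that single command only:

```shell
# Hypothetical sample data: three ASCII lines plus one line holding
# the single byte 0xE9 (octal 351), an "upper-ASCII" character.
printf 'z\n\351\nB\na\n' > sample.txt

# Override the locale for this one sort invocation only.
# In the POSIX locale, lines are compared byte by byte.
LC_ALL=POSIX sort sample.txt > sorted.txt

# Resulting order by byte value: B (66), a (97), z (122), then 0xE9 (233).
od -c sorted.txt
```

To apply the setting to a whole session instead, one could run
"export LC_ALL=POSIX" first, though that also changes the locale seen by
every other program started from that shell.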

Vlado



>
> Thanks in advance,
> Bill Fletcher
>


