[Lexicog] Sorting

Peter Kirk peterkirk at QAYA.ORG
Sun Mar 21 23:54:36 UTC 2004


On 21/03/2004 15:50, William J Poser wrote:

> No "standard" sort order, even with Unicode, is likely ever to
> be sufficient for lexicography and related work. Not only do we
> need to be able to do alphabetization for minority languages
> whose sort order is unique, but it is desirable
> to be able to sort in other ways for other purposes. Once equipped
> with a program that allowed me to specify completely arbitrary sort
> orders with essentially unlimited numbers of multigraphs of essentially
> unlimited length (the last time I checked, which was a long time ago,
> Shoebox limited multigraphs to four characters), I discovered uses
> for sorting that I hadn't previously thought of. For instance, I have
> generated topical indices automatically from the semantic field
> information. Sorting records by semantic field requires the use
> of "multigraphs" as long as "gathering-plants-scrapingcambium".
>
> Bill
>
The Unicode Collation Algorithm is designed to work with tailored
(customised) sort weights and orders, as well as with the default
weights defined by Unicode. This should provide adequate flexibility for
unusual sort orders in minority languages. It isn't designed for very
long "multigraphs"; I think the algorithm itself could cope with them,
but I wouldn't be sure that implementations can.
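
To make the idea of an arbitrary sort order with multigraphs concrete,
here is a minimal sketch in Python. It is not the Unicode Collation
Algorithm and not how Shoebox works; it is a single-level, longest-match
comparison only. The name make_sort_key, the ORDER list and the sample
words are all made up for illustration.

```python
def make_sort_key(order):
    """Build a sort key function from an ordered list of sorting units.

    `order` lists single characters and multigraphs in the desired
    sequence. Units may be of any length in this sketch.
    """
    units = [u for u in order if u]  # ignore empty strings
    rank = {unit: i for i, unit in enumerate(units)}
    # Try longer units first so that "ch" wins over "c" followed by "h".
    by_length = sorted(units, key=len, reverse=True)

    def key(word):
        weights = []
        i = 0
        while i < len(word):
            for unit in by_length:
                if word.startswith(unit, i):
                    weights.append(rank[unit])
                    i += len(unit)
                    break
            else:
                # Unlisted character: sort after all listed units,
                # falling back to code point order (an assumption).
                weights.append(len(units) + ord(word[i]))
                i += 1
        return weights

    return key


# Hypothetical order in which "ch" and "ll" count as single letters.
ORDER = ["a", "b", "c", "ch", "d", "e", "h", "i", "l", "ll", "m", "o", "u"]
key = make_sort_key(ORDER)

words = ["dama", "chico", "cola", "llama", "luma"]
print(sorted(words, key=key))
# ['cola', 'chico', 'dama', 'luma', 'llama']
```

Because the units are plain strings, nothing in this sketch limits their
length, so a semantic field label could be listed as a single unit in the
same way as "ch". Whether a real collation implementation handles units
that long is a separate question.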

--
Peter Kirk
peter at qaya.org (personal)
peterkirk at qaya.org (work)
http://www.qaya.org/



