[Lexicog] msort

William J Poser billposer at ALUM.MIT.EDU
Wed Mar 3 02:50:28 UTC 2004


I am the author of msort and can answer Wayne Leman's questions.
msort can parse Shoebox style files, allows arbitrary sort orders to be
specified for each key, allows essentially unlimited numbers of multigraphs
of arbitrary length, can handle optional keys, has a basic exclusion
facility (so that you can, for example, ignore leading hyphens),
and can reverse keys.
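
To illustrate what multigraph-aware sorting with a custom order involves, here is a minimal Python sketch. It is not msort's code and does not use msort's command-line or sort-order file syntax; the function names (tokenize, make_sort_key), the ORDER list, and the sample words are all invented for the example. It assumes a Spanish-like ordering in which "ch" and "ll" are single letters, and it treats leading hyphens as excluded characters.

```python
def tokenize(key, graphs):
    """Split key into collation units, preferring the longest multigraph."""
    by_length = sorted(graphs, key=len, reverse=True)
    units = []
    i = 0
    while i < len(key):
        for g in by_length:
            if key.startswith(g, i):
                units.append(g)
                i += len(g)
                break
        else:
            # Character not listed in the sort order: keep it as its own unit.
            units.append(key[i])
            i += 1
    return units


def make_sort_key(order, exclude="", reverse=False):
    """Build a key function for sorted() from a list of graphs in rank order."""
    rank = {g: n for n, g in enumerate(order)}

    def sort_key(key):
        stripped = key.lstrip(exclude)      # ignore leading excluded characters
        units = tokenize(stripped, order)
        if reverse:
            units.reverse()                 # compare from the end of the key
        # Unlisted units all sort after every listed graph (an assumption).
        return [rank.get(u, len(order)) for u in units]

    return sort_key


# Hypothetical sort order: "ch" follows "c", "ll" follows "l".
ORDER = ["a", "c", "ch", "d", "e", "l", "ll", "m", "o"]
words = ["-dama", "chama", "coma", "llama", "loma", "cama"]

sorted_words = sorted(words, key=make_sort_key(ORDER, exclude="-"))
print(sorted_words)
```

A plain character-by-character sort would put "chama" before "coma" and "llama" before "loma"; treating the multigraphs as units reverses both pairs, and "-dama" sorts as if the hyphen were absent.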

At the moment, there is no DOS or MS Windows version of msort.
Once upon a time there was a DOS version, but I haven't used it
in a long time and can no longer find the files on-line. If I look hard,
I might find an old diskette with them. I do have a frozen set of
source files for the DOS version, so in theory I could dig out
the compiler and compile it again. In those days I used the
Power C compiler from a company called MIX, which apparently
is still around, or at least still has a web site.

However, the DOS version lacks some of the nicer, more recent
features.  More important, it was subject to fairly severe memory
limitations that made it useless for a lexicon beyond a certain size.
I don't recall offhand whether it was limited to 640K or whether it
could use "high" memory up to 1MB, but whatever it was, it was pretty small.

The current version was developed and tested under GNU/Linux
and is known to work on other varieties of Unix. It is written
in POSIX-conformant C and does not make use of the window system
or anything else likely to be operating-system dependent.
As a result, I suspect that it would be possible to get msort to
compile and run under MS Windows using a compiler that provided
a POSIX environment. I haven't done this yet because I don't use MS Windows
and don't have a C compiler for it, but if there is interest I would
be willing to have a go at it. I know some people who program under
MS Windows and could probably get access to a compiler and some advice.

However, it would probably be best to wait until I complete the next set
of changes to msort. The most important thing in the works is to make
it use Unicode. Right now it will happily work with 8-bit characters,
but it doesn't know Unicode. I plan in the near future to adapt it to
handle UTF-8 Unicode, which will make it much more useful.
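
To show why a program that assumes one byte per character needs changes for UTF-8, here is a small Python sketch. It says nothing about how msort itself reads its input; the sample word is arbitrary. It only demonstrates that a non-ASCII letter occupies more than one byte in UTF-8, so a byte-oriented reader sees two "characters" where there is one.

```python
word = "señor"
raw = word.encode("utf-8")

n_chars = len(word)   # number of Unicode characters
n_bytes = len(raw)    # number of bytes in the UTF-8 encoding

# What a program sees if it treats each byte as one 8-bit character:
as_8bit = raw.decode("latin-1")

print(n_chars, n_bytes, as_8bit)
```

Here the single letter "ñ" is stored as the two bytes C3 B1, so any sort order or multigraph definition written in terms of "ñ" would not match the input unless the program decodes UTF-8 first.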

msort is available for the Macintosh, if you are running OS X.
An OS X executable can be downloaded from my web site.
I haven't tested it on the Mac myself, but a friend compiled it
on his Mac and says it works, though I don't think he has tested it
very rigorously.

For those interested, the msort manual, source code, and executables
for GNU/Linux on Intel hardware, Sun, and Mac OS X are available at:
http://www.cis.upenn.edu/~wjposer/software.html#msort

Bill


--
Bill Poser, Linguistics, University of Pennsylvania
http://www.ling.upenn.edu/~wjposer/ billposer at alum.mit.edu


