[Corpora-List] workstation advice for corpus linguistics work

Trevor Jenkins trevor.jenkins at suneidesis.com
Mon Jan 17 23:32:24 UTC 2011


Hi Don,

> I don't know a great deal about these particular software packages or
> what you will be trying to do with them.

Justin Washtell has covered the hardware options for you. RAM and lots of
it. 64-bit operating system environment with a Unix flavour or with a full
Windows version (not "Home" or "Starter" options). Though I would add lots
of disk space too.

I'd suggest that you open this up a little. Your original comment was

> I'm looking at Dell workstations.

but are you contractually obliged to use Dell? Personally I'm a Mac user
and that puts me fair and square into the Unix flavour. With OS X 10.5
(Leopard), 10.6 (Snow Leopard), and the forthcoming 10.7 (Lion), Mac OS is
a 64-bit system. And at the operating system level it is effectively the
Linux rival FreeBSD with some tweaks by Apple.

> These are the software that I will be using and operating systems that I
> am thinking I will need:

Well, I've added another one to your list with Apple. Of course, that would
also dictate your hardware choice. But then I'm a Mac addict.

I've re-ordered your software list a little.

> R (e.g., for multiple runs of Fisher’s exact test)
> Linux
> Perl programs (multiple text manipulation programs)

All examples of open source software.

I'd also add Python to the list so you could make use of NLTK (the Natural
Language Toolkit).
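
To give a flavour of the kind of text manipulation Python handles, here is
a minimal sketch that counts word frequencies using only the standard
library. The function name word_frequencies, the sample sentence, and the
crude regular-expression tokeniser are my own illustrative assumptions;
NLTK provides far more capable tokenisers, taggers, and corpus readers than
this.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Return a Counter of lower-cased word tokens found in text."""
    # Crude tokeniser (an assumption for illustration): runs of letters,
    # optionally joined by internal apostrophes, e.g. "didn't".
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())
    return Counter(tokens)


# Hypothetical sample text; in practice you would read corpus files here.
sample = "The cat sat on the mat. The mat didn't mind."
freqs = word_frequencies(sample)

# Print the three most frequent tokens with their counts.
for word, count in freqs.most_common(3):
    print(word, count)
```

The same few lines scale to whole files by passing in the file contents,
which is roughly the job many small Perl text manipulation scripts do.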

> Word
> Windows
> Excel
> Access

Personally I'd spend my money on getting the best hardware rather than on
proprietary/expensive software. Continuing the open source theme, the
OpenOffice.org package is functionally equivalent to the combination of
Word, Excel, and Access, all for the rock-bottom price of some download
bandwidth. At least for Word and Excel there is file format compatibility
--- indeed at one time the only way to get compatibility between
particular versions of Microsoft Office was to use OpenOffice.org!

> Perhaps other SQL applications

Mac OS X (and also various Linux distributions) ships with at least one
of the major open source SQL database systems installed. They may be on
additional software discs, but in any event MySQL, PostgreSQL, SQLite and
others are only the cost of a one-off download away.
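
As a rough illustration of what an SQL database buys you for corpus work,
the sketch below uses SQLite through Python's standard sqlite3 module. The
table name tokens, its columns, and the two toy documents are hypothetical
and chosen only for the example; a real corpus would need a schema designed
around your own annotation.

```python
import sqlite3

# An in-memory database keeps the example self-contained; pass a file
# name instead of ":memory:" to keep the data on disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tokens (doc_id INTEGER, position INTEGER, word TEXT)"
)

# Hypothetical toy documents, split on whitespace.
docs = {1: "the cat sat on the mat", 2: "the dog sat"}
rows = [
    (doc_id, position, word)
    for doc_id, text in docs.items()
    for position, word in enumerate(text.split())
]
conn.executemany("INSERT INTO tokens VALUES (?, ?, ?)", rows)

# Frequency list: most frequent words first, ties broken alphabetically.
top = conn.execute(
    "SELECT word, COUNT(*) AS n FROM tokens "
    "GROUP BY word ORDER BY n DESC, word ASC LIMIT 3"
).fetchall()
print(top)
conn.close()
```

The same query would run with little or no change against MySQL or
PostgreSQL, which is the point of thinking in terms of function rather
than product.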

You might want to consider CWB (Corpus WorkBench) too. It is open source
as well, and it requires more than Access's somewhat limited features to
function.

Think in terms of function not product while you're planning this
workstation.

And don't forget support. You will get faster and less tortuous access to
support using open source products than the expensive ones where all
reference has to be made to the hardware manufacturer and they ain't got
the first notion about corpus work. Also your open source support is going
to cost you less --- put that part of the budget into the hardware too.

Regards, Trevor

<>< Re: deemed!



_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora


