[Corpora-List] workstation advice for corpus linguistics work
Michal Ptaszynski
ptaszynski at media.eng.hokudai.ac.jp
Tue Jan 18 13:16:44 UTC 2011
Dear Don
You might look for a configuration similar to the one below.
1. RAM: 24 GB or more. If you are aiming for processing speed and have a large
budget, you might order higher-grade RAM than the usual crap they put
into PCs in stores.
2. Hard disk, RAID: if you want to run lots of queries against the corpus in a
short time, I'd recommend SSDs, for example 4 x 256 GB SSDs in RAID 0 (= 1 TB
of SSD storage). However, since SSDs have limited durability, I'd also make
frequent backups to a traditional hard drive. I was told that a year is a long
lifetime for an SSD if you are a hard-core corpus analysis maniac. :)
3. CPU: the choice of CPU, and therefore of motherboard, depends on how much
RAM you want to stuff your PC with. For example, it is said that Intel's i7
processors effectively support no more than 24 GB of RAM. If you
want more, you should choose a Xeon or similar.
4. OS: Linux and Windows, both in 64-bit (x64) versions. I'd also recommend
using 64-bit software, like Excel 2010. As for Perl on 64-bit machines, a
couple of years ago there were still some problems with compiling, but they
should be resolved by now.
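To make the RAID 0 trade-off in point 2 concrete, here is a rough sketch in Python. It assumes that drive failures are independent and uses a made-up 5% annual failure rate per drive purely for illustration; the function names are hypothetical, and real SSD failure rates depend on the model and the write load.

```python
def raid0_capacity_gb(drive_count: int, drive_size_gb: int) -> int:
    """RAID 0 stripes data across all drives, so usable capacity is the sum."""
    return drive_count * drive_size_gb


def raid0_annual_failure_probability(drive_count: int, per_drive_afr: float) -> float:
    """Probability that at least one drive fails within a year.

    RAID 0 has no redundancy: losing any single drive loses the whole array.
    ASSUMPTION: drive failures are independent of each other.
    """
    return 1.0 - (1.0 - per_drive_afr) ** drive_count


if __name__ == "__main__":
    capacity = raid0_capacity_gb(4, 256)
    # ASSUMPTION: 5% annual failure rate per drive, an illustrative figure only.
    risk = raid0_annual_failure_probability(4, 0.05)
    print(f"Usable capacity: {capacity} GB")
    print(f"Chance of losing the array within a year: {risk:.1%}")
```

Under these assumptions the four-drive array gives 1024 GB, and the chance of losing it within a year is about 18.5%, well above the 5% for a single drive. That is why the frequent backups to a traditional hard drive matter.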
Best regards and good luck!
--
Michal PTASZYNSKI
Institute of Engineering, Hokkai-Gakuen University
High-Tech Research Center, Intelligent Techniques Laboratory 6,
Minami 26, Nishi 11, Chuo-ku, Sapporo, 064-0926, Japan
ptaszynski at hgu.jp, ptaszynski at ieee.org
TEL: +81-11-841-1161 (ext.: 7796), FAX: +81-11-551-2951
http://arakilab.media.eng.hokudai.ac.jp/~ptaszynski/
----------------------------
From: Justin Washtell <lec3jrw at leeds.ac.uk>
To: Donald E Hardy <donhardy at unr.edu>, "CORPORA at UIB.NO" <CORPORA at uib.no>
Date: Mon, 17 Jan 2011 21:17:32 +0000
Subject: Re: [Corpora-List] workstation advice for corpus linguistics work
Dear all,
I’m looking for advice on purchasing a workstation for corpus work.
This is the software that I will be using, along with the operating systems
that I think I will need:
R (e.g., for multiple runs of Fisher’s exact test)
Word
Windows
Linux
Perl programs (multiple text manipulation programs)
Excel
Access
Perhaps other SQL applications
XAIRA
ICECUP 3.1
I’m sure there will be other software packages added to the list.
Corpora include data gathered from Corpus of Contemporary American
English, Corpus of Historical American English, BNC, Treebank, ICE-GB,
Brown, Frown
I’m looking at Dell workstations.
Recommendations I’m looking for are operating system(s), CPU, RAM, Video
card, hard disk, RAID.
I am relatively computer literate (program in Perl, manage a server); and,
I do have expert technicians for help and advice locally. However, I
don’t have anyone locally for advice on the best system setup for corpus
linguistic work.
Thanks very much,
Don Hardy
_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora