[Corpora-List] Runtime for tprep?

Kevin B. Cohen kevin.cohen at gmail.com
Fri Oct 19 16:30:35 UTC 2012


Does anyone have any intuitions about what the runtime for tprep
should be?  (tprep is the program that you run on a treebank to
preprocess it for the tgrep program, which allows you to search
treebanks.)  I'm running tprep on a fairly small corpus--about 419K
words.  It almost immediately gave me a message saying that it had
built a vocabulary file, and ever since then has just been sitting
there, chewing up most of my CPU for the past hour and fifteen
minutes.  This seems like a long runtime for 419K words, so I was
wondering if anyone out there remembers their experience with running
tprep and whether this sounds normal or not...  The support email
address in LDC's man page for tprep bounces...

Thanks,

Kev

-- 
Kevin Bretonnel Cohen, PhD
Biomedical Text Mining Group Lead, Computational Bioscience Program,
U. Colorado School of Medicine
303-916-2417 (cell) 303-377-9194 (home)
http://compbio.ucdenver.edu/Hunter_lab/Cohen
