[Corpora-List] concordance program for large files

Yannick Versley versley at sfs.uni-tuebingen.de
Wed Sep 3 07:29:16 UTC 2008


Hi,

I have found CQP (part of the Corpus Workbench, see http://cwb.sourceforge.net/ )
quite useful for medium-to-large corpora. Setting everything up takes a little
time (basically, you need to tokenize your corpus, POS-tag it if you want, and
then run cwb-encode and cwb-makeall on it), but if you also want to run more
complicated queries, it's usually worth it.
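
To make the preparation step a bit more concrete, here is a minimal sketch of
turning plain sentences into the one-token-per-line ("vertical") text that
cwb-encode reads, with each sentence wrapped in <s> ... </s>. The regex
tokenizer and the function name verticalize are assumptions made for
illustration only; a real setup would use a proper tokenizer and, if wanted,
add the POS tag as a second tab-separated column on each token line.

```python
import re
from typing import Iterable, Iterator

# Naive tokenizer (an assumption for this sketch): a run of word characters,
# or any single character that is neither a word character nor whitespace.
TOKEN_RE = re.compile(r"\w+|[^\w\s]")


def verticalize(sentences: Iterable[str]) -> Iterator[str]:
    """Yield one token per line, wrapping each non-empty sentence in <s> tags."""
    for sentence in sentences:
        tokens = TOKEN_RE.findall(sentence)
        if not tokens:
            # Skip empty or whitespace-only sentences entirely.
            continue
        yield "<s>"
        yield from tokens
        yield "</s>"


if __name__ == "__main__":
    sample = ["The cat sat on the mat.", "It purred."]
    print("\n".join(verticalize(sample)))
```

Because the function is a generator, it can stream a large corpus line by line
into a file without holding the whole text in memory. The resulting file is
what would then be handed to cwb-encode; the exact command-line options are
best taken from the CWB documentation.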

Best,
Yannick

> I was just wondering if you know of a good concordance program that deals
> with large files of over 1 million words that I might be able to use for my
> research. Has anyone had any experience with one? There are a few free ones
> on the internet, but they often don't deal with really large files.
>
> Regards,
> Jaime
>
> Mr Jaime Hunt MAppLing (TESOL), BA (Hons)
> PhD (Linguistics) Candidate
> School of Humanities and Social Science
> McMullin Building
> University of Newcastle
> Callaghan
> NSW 2308
> Australia
>
> Ph. +61 (0)2 4921 5175
> Email: jaime.hunt at studentmail.newcastle.edu.au
>
> _______________________________________________
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora

-- 
Yannick Versley
Seminar für Sprachwissenschaft, Abt. Computerlinguistik
Wilhelmstr. 19, 72074 Tübingen
Tel.: (07071) 29 77352



