[Corpora-List] Corpora and SQL
Ulrik Petersen
ulrikp at hum.aau.dk
Thu May 24 16:52:18 UTC 2007
I agree with Dr. Kilgarriff: "A dedicated corpus query tool does
everything well without extra engineering".
The original poster (and others) might want to check out how I have
implemented a complete corpus query system on top of SQL. I haven't
compared its feature set with CWB, but it is comparable to TIGERSearch
in that it handles syntax very well. It also handles word-level queries
quite quickly.
http://emdros.org
This is part of my doctoral research, and interested parties can find a
number of publications about Emdros here:
http://www.hum.aau.dk/~ulrikp/publications.html
The website also has extensive documentation on how it is done, and
Emdros is Open Source.
Thanks.
Best,
Ulrik Petersen
Adam Kilgarriff wrote:
> Of course a dedicated corpus query tool does everything well without extra
> engineering. When I see a discussion like this with lots of comments like
> "with a bit of effort", I think "how many person-hours do they mean? (And,
> how good a solution will it be?) Unless person-hours are very cheap, it will
> cost less to buy a service that already does what is wanted." But, seeing
> as I have such a service to sell, I'd better stop there or I shall be thrown
> off the list for being commercial
>
> Adam
> http://www.kilgarriff.co.uk
>
--
Ulrik Petersen, PhD candidate (comp.ling.), MA (comp.ling.), B.Sc.
(comp.sci. & math)
http://ulrikp.org -- Homepage
http://emdros.org -- Emdros is a corpus query system