Corpora: Using SARA to query other corpora than the BNC

Thomas Kuenneth tommi at linguistik.uni-erlangen.de
Mon Jun 25 09:11:16 UTC 2001


in response to Hans Martin Lehmann:

> It is based on the SARA server package with additional functionality
> implemented in perl and mysql.

Well, the (additional) use of an RDBMS shows that the approach we have developed
here at Erlangen may be on just the right track :-)

As I have already pointed out, my basic assumption is that relational
databases can be a very solid base for corpus data, provided an appropriate set
of tables is defined. I hope (and think) Lou will share my assumption that
implementing the Sara server was FAR from being trivial, that a lot of design
decisions had to be made and that a lot of problems had to be solved. Database
systems have had to face such problems for almost 30 years now. And the systems
that are "out there" have a lot to offer in terms of performance, efficient
storage of data, stability and scalability. So why not take advantage of this
wisdom? Why implement it all over again?
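
To make "an appropriate set of tables" a little more concrete, here is a minimal
sketch of what I mean. It is NOT the Erlangen schema and has nothing to do with
SARA's internals; the table names (text, token) and the functions build_corpus
and concordance are made up for illustration, and SQLite via Python simply
stands in for whatever RDBMS one would really use. One row per token, with an
index on the word form, is assumed to be enough for a simple keyword-in-context
lookup.

```python
import sqlite3


def build_corpus(texts):
    """Load {text_name: raw_string} into an in-memory database, one row per token.

    Hypothetical schema for illustration only: whitespace tokenisation,
    lower-cased word forms, no annotation.
    """
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE text  (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
        CREATE TABLE token (text_id INTEGER REFERENCES text(id),
                            pos     INTEGER,  -- 0-based position within the text
                            word    TEXT,
                            PRIMARY KEY (text_id, pos));
        CREATE INDEX token_word ON token(word);
    """)
    for name, content in texts.items():
        text_id = db.execute(
            "INSERT INTO text(name) VALUES (?)", (name,)).lastrowid
        db.executemany(
            "INSERT INTO token VALUES (?, ?, ?)",
            [(text_id, i, w.lower()) for i, w in enumerate(content.split())])
    return db


def concordance(db, word, width=2):
    """Return (text name, context) pairs: each hit with `width` tokens either side."""
    hits = db.execute(
        "SELECT t.id, t.name, k.pos FROM token k "
        "JOIN text t ON t.id = k.text_id "
        "WHERE k.word = ? ORDER BY t.id, k.pos",
        (word.lower(),)).fetchall()
    lines = []
    for text_id, name, pos in hits:
        # The primary key (text_id, pos) lets the database fetch the window directly.
        context = db.execute(
            "SELECT word FROM token "
            "WHERE text_id = ? AND pos BETWEEN ? AND ? ORDER BY pos",
            (text_id, pos - width, pos + width)).fetchall()
        lines.append((name, " ".join(w for (w,) in context)))
    return lines
```

Calling concordance(build_corpus({"demo": "The cat sat on the mat"}), "sat", 1)
would look up the hit through the word index and then read its neighbours by
position. Indexing, storage and concurrent access are all left to the database,
which is exactly the point.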

And that, to conclude, is why I think that an RDBMS is a better approach
than any proprietary system. Now I know there may be some provocative aspects in
what I have said. I am happy to discuss them, so please feel free to
comment. Please keep in mind, however, that during COMPLEX I may not be able
to answer, so my response might again take some time.

Regards
Thomas
---
Thomas Kuenneth M.A.           Universitaet Erlangen-Nuernberg
Institut fuer Germanistik         Abteilung Computerlinguistik
Bismarckstr. 6  *  D-91054 Erlangen  *  Tel.: +49 9131 8529250
http://www.linguistik.uni-erlangen.de/~tommi