[Corpora-List] Customisable Concordancer for ESP (Law) Learners?

Ulrik Petersen ulrikp at hum.aau.dk
Thu Jan 25 12:30:36 UTC 2007


Dr. Bruce,

In reference to your query on Corpora-List, please take a look at my 
Emdros search engine:

http://emdros.org

It is a generic corpus query tool with a powerful query language that 
can quickly and easily search a 1-million-word corpus with a context of 
3 words to the left or to the right.  I say "search" rather than "sort" 
because Emdros is really a search engine _on top of which_ you can then 
build almost any linguistic tool that needs a database.

Emdros is Open Source, so your requirement number 2 (a customisable 
interface) is met.

I have built a linguistic tool that is a generic query interface for 
Emdros.  You might want to download Emdros and try out the Query Tool 
to see whether it is anything like what you need.  If you find that it 
"almost" suits your needs, then we might be able to extend it to do 
exactly what you need.

I would be able to help you in the development of a tool like the one 
you describe, either by writing code myself, or by answering questions 
about building applications on top of Emdros.

If you decide to use Emdros and the Query Tool is not to your liking, 
I would recommend using a scripting language like Python to build a 
nice GUI on top of Emdros.

Emdros comes with language bindings for Java, Python, Ruby, Perl, and 
PHP.  Others can be added as SWIG permits (www.swig.org).

The Query Tool could easily be extended to have a concordance view 
rather than the running-text-with-brackets view that it currently has.
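
To illustrate what a concordance view sorted on left or right context 
involves, here is a minimal sketch in plain Python.  It does not use 
Emdros or its bindings at all; the function names (kwic, sort_right, 
sort_left, format_rows) and the simple regex tokeniser are assumptions 
made purely for illustration, and a real tool would take its hits from 
the search engine instead of scanning the text.

```python
import re

def kwic(text, keyword, width=3):
    """Collect (left, node, right) rows for each hit of `keyword`,
    with up to `width` words of context on either side.
    Hypothetical helper: tokenisation is a naive lowercase word split."""
    tokens = re.findall(r"\w+", text.lower())
    target = keyword.lower()
    rows = []
    for i, tok in enumerate(tokens):
        if tok == target:
            left = tokens[max(0, i - width):i]
            right = tokens[i + 1:i + 1 + width]
            rows.append((left, tok, right))
    return rows

def sort_right(rows):
    """Sort by the words to the right of the node, nearest word first."""
    return sorted(rows, key=lambda row: row[2])

def sort_left(rows):
    """Sort by the words to the left of the node, nearest word first."""
    return sorted(rows, key=lambda row: row[0][::-1])

def format_rows(rows):
    """Render rows as aligned keyword-in-context lines."""
    return ["{:>30}  [{}]  {}".format(" ".join(l), n, " ".join(r))
            for l, n, r in rows]

if __name__ == "__main__":
    sample = ("The court held that the contract was void. "
              "The court found that the tenant was liable.")
    for line in format_rows(sort_right(kwic(sample, "court"))):
        print(line)
```

Sorting a list of short context tuples like this is cheap; the costly 
part on a large corpus is finding the hits, which is the job of the 
search engine.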

If you find that you need commercial support, my company, Emergence 
Consult, can also take care of that:

http://emergence.dk

Finally, there is an effort at a university in Toulouse, France, to 
build a concordancer which uses Emdros as the underlying database 
technology.  The concordancer has already been built and has been 
deployed among linguists who use it with great success.  However, the 
source code has not yet been published, and so it is, for the moment, 
closed to the world.  I might be able to negotiate with the Toulouse 
people about Open Sourcing their work.

Thanks for your time.

Best regards,

Ulrik Petersen
University of Aalborg, Denmark
http://www.hum.aau.dk/~ulrikp/




nigel bruce wrote:
> I am building and refining a corpus and Vocabulary research Tools for 
> an English for Law programme in an ESL-medium university.
> We have a basic set of tools that will concordance & do frequency 
> analysis, differentiating general from academic/legal terms.
> We want to go to the next level and develop a more powerful set of 
> tools - and our students are the researchers targeted to use the tools 
> & database.
>
> I wonder if anyone can recommend a Concordancer that
> 1. is powerful enough to speedily sort a 1 million-word corpus by 3 
> words to the left or right (our main new requirement)
> 2. allows the user to customise the interface, add features, and 
> select/delete unwanted features
> Expense is not an issue if it can permit the above.  If anyone can 
> propose search functions that they have found especially useful for 
> learners, I'm very interested in hearing new ideas.
>
> I work with ESL Law students and legal corpora - if anyone has 
> experience of using corpora in ESL higher ed., I'd like to exchange 
> ideas and experiences.
>
> Nigel Bruce
>
>
> _________________________________________
>
> Nigel Bruce
> English Centre
> 7/F, K.K. Leung Bdg.
> University of Hong Kong,
> Pokfulam Road,
> HONG KONG
>
> E-mail: njbruce at hku.hk
> http://ec.hku.hk/njbruce/
> Office Tel.: (852) 2859.2023;  Fax: (852) 2547.3409
>
>
>


