Corpora: GATE version 2 released

Hamish Cunningham h.cunningham at dcs.shef.ac.uk
Fri Mar 15 14:30:22 UTC 2002


[You are receiving this mail because you have previously
 downloaded GATE, or are a member of a relevant list or
 organisation. Apologies for multiple copies.]

Release 2.0 of GATE, a General Architecture for Text Engineering,
is now available for download from

  http://gate.ac.uk/

GATE is an architecture, development environment and framework (or
SDK) for building systems that process human language. It has been
in development at the University of Sheffield since 1995, and has
been used for many R&D projects, including Information Extraction
in multiple languages and for multiple tasks and clients.

GATE is free software under the GNU library licence. Version 2
has been completely redeveloped in Java, and is a stable, robust,
and scalable infrastructure for Natural Language Processing, which
allows users to focus on NLP tasks, while mundane tasks like
data storage, format analysis and data visualisation are handled
by GATE. The new version has NLP components that will enable you
to reliably process documents, including Web documents supplied
as URLs, and obtain information such as the sentences they
contain, person names, organisations, etc. This is based on
a set of reusable NLP components, which you can also use outside
GATE by embedding them into your own applications (e.g. a news
indexing service). GATE also provides standard tools for manual
annotation and performance evaluation, which are essential during
application development. GATE and its NLP components have been
successfully used in a large number of research projects and
commercial applications.

A summary of features:

- an architecture that describes NLP systems (including embedded
  systems) as components, and that defines a set of use cases for
  NLP infrastructure
- a framework, or class library, that implements the architecture
- a graphical development environment built on the framework
- re-taskable components (beans), inc. GUI components
- web-loaded components (HTTP, XML config)
- distributed data (JDBC)
- annotation model: "standoff markup", isomorphic with ATLAS,
  typing based on XSchema
- annotation differences viewer and automated measurement of
  accuracy
- XML I/O, XML system configuration; run-time interoperation with
  e.g. XSLT or XPath (via re-positioning info)
- JAPE, a pattern language for finite state transduction (FST) over
  annotations
- ANNIE, A Nearly-New Information Extraction system for English
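
To illustrate the "standoff markup" and accuracy measurement items
above, here is a minimal sketch in plain Java. It does not use the
GATE API: the class StandoffSketch, the nested Annotation class and
the precision/recall helpers are all hypothetical names invented for
this example, and the strict exact-match scoring is an assumption
(GATE's own differences viewer may count matches differently). The
point is only that annotations refer to the text by character offsets
and never modify it, so two annotation sets over the same text can be
compared directly.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical illustration only; none of these are GATE classes.
class StandoffSketch {

    // A standoff annotation: a type label plus character offsets into the
    // text. The text itself is left untouched.
    static final class Annotation {
        final String type;
        final int start; // inclusive offset
        final int end;   // exclusive offset

        Annotation(String type, int start, int end) {
            this.type = type;
            this.start = start;
            this.end = end;
        }

        // Recover the annotated span from the unmodified text.
        String coveredText(String text) {
            return text.substring(start, end);
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Annotation)) return false;
            Annotation a = (Annotation) o;
            return type.equals(a.type) && start == a.start && end == a.end;
        }

        @Override
        public int hashCode() {
            return Objects.hash(type, start, end);
        }
    }

    // Number of response annotations that match a key annotation exactly
    // (same type, same offsets). Strict matching is an assumption here.
    static int correct(Set<Annotation> key, Set<Annotation> response) {
        Set<Annotation> both = new HashSet<>(response);
        both.retainAll(key);
        return both.size();
    }

    static double precision(Set<Annotation> key, Set<Annotation> response) {
        return response.isEmpty() ? 0.0
                : (double) correct(key, response) / response.size();
    }

    static double recall(Set<Annotation> key, Set<Annotation> response) {
        return key.isEmpty() ? 0.0
                : (double) correct(key, response) / key.size();
    }

    public static void main(String[] args) {
        String text = "Hamish Cunningham works at the University of Sheffield.";

        // Manually created "key" annotations.
        Set<Annotation> key = new HashSet<>();
        key.add(new Annotation("Person", 0, 17));
        key.add(new Annotation("Organization", 31, 54));

        // Automatic "response": the organisation span wrongly includes "the".
        Set<Annotation> response = new HashSet<>();
        response.add(new Annotation("Person", 0, 17));
        response.add(new Annotation("Organization", 27, 54));

        for (Annotation a : key) {
            System.out.println(a.type + ": " + a.coveredText(text));
        }
        System.out.println("precision = " + precision(key, response));
        System.out.println("recall    = " + recall(key, response));
    }
}
```

In this sketch the response gets the person name right but the wrong
organisation boundary, so one of its two annotations matches the key
and both precision and recall come out at 0.5.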

To contact the GATE project, mail
  gate-crashers at dcs.shef.ac.uk
or see the support page
  http://gate.ac.uk/support.html

Regards,

Dr. Hamish Cunningham
Department of Computer Science
University of Sheffield, UK
http://www.dcs.shef.ac.uk/~hamish


