Corpora: GATE version 2 released
Hamish Cunningham
h.cunningham at dcs.shef.ac.uk
Fri Mar 15 14:30:22 UTC 2002
[You are receiving this mail because you have previously
downloaded GATE, or are a member of a relevant list or
organisation. Apologies for multiple copies.]
Release 2.0 of GATE, a General Architecture for Text Engineering,
is now available for download from
http://gate.ac.uk/
GATE is an architecture, development environment and framework (or SDK) for
building systems that process human language. It has been in
development at the University of Sheffield since 1995, and has
been used for many R&D projects, including Information Extraction
in multiple languages and for multiple tasks and clients.
GATE is free software under the GNU library licence. Version 2
has been completely redeveloped in Java, and is a stable, robust,
and scalable infrastructure for Natural Language Processing, which
allows users to focus on NLP tasks, while mundane tasks like
data storage, format analysis and data visualisation are handled
by GATE. The new version has NLP components that will enable you
to reliably process documents, including Web documents supplied
as URLs, and obtain information such as the sentences they
contain, person names, organisations, etc. This is based on
a set of reusable NLP components, which you can also use outside
GATE by embedding them into your own applications (e.g. a news
indexing service). GATE also provides standard tools for manual
annotation and performance evaluation, which are essential during
application development. GATE and its NLP components have been
successfully used in a large number of research projects and
commercial applications.
A summary of features:
- an architecture that describes NLP systems (including embedded
systems) as components, and that defines a set of use cases for
NLP infrastructure
- a framework, or class library, that implements the architecture
- a graphical development environment built on the framework
- re-taskable components (beans), inc. GUI components
- web-loaded components (HTTP, XML config)
- distributed data (JDBC)
- annotation model: "standoff markup", isomorphic with ATLAS,
typing based on XSchema
- annotation differences viewer and automated measurement of
accuracy
- XML I/O, XML system configuration
- run-time interoperation with e.g. XSLT or XPath (via re-positioning info)
- JAPE, a pattern language for finite state transduction over annotations
- ANNIE, A Nearly-New Information Extraction system for English
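To make the "standoff markup" and accuracy-measurement items above more concrete, here is a minimal sketch in plain Java. It does not use the GATE API: the class StandoffSketch, the Annotation record and the method names are hypothetical, chosen only for illustration. It assumes strict matching, where a response annotation counts as correct only if its offsets and type equal those of a key annotation; the measures GATE itself computes may differ.

```java
import java.util.HashSet;
import java.util.Set;

public class StandoffSketch {
    // A standoff annotation points into the text by character offsets
    // (start inclusive, end exclusive); the text itself is never modified.
    record Annotation(int start, int end, String type) {}

    // Recover the annotated span from the untouched source text.
    static String covered(String text, Annotation a) {
        return text.substring(a.start(), a.end());
    }

    // Count response annotations that exactly match a key annotation
    // (same offsets and same type). This strictness is an assumption.
    static int correct(Set<Annotation> key, Set<Annotation> response) {
        Set<Annotation> both = new HashSet<>(response);
        both.retainAll(key);
        return both.size();
    }

    // Fraction of response annotations that are correct.
    static double precision(Set<Annotation> key, Set<Annotation> response) {
        return response.isEmpty() ? 0.0
                : (double) correct(key, response) / response.size();
    }

    // Fraction of key annotations that were found.
    static double recall(Set<Annotation> key, Set<Annotation> response) {
        return key.isEmpty() ? 0.0
                : (double) correct(key, response) / key.size();
    }

    public static void main(String[] args) {
        String text = "Hamish Cunningham works at the University of Sheffield.";

        // Manually created "key" annotations.
        Set<Annotation> key = Set.of(
                new Annotation(0, 17, "Person"),
                new Annotation(31, 54, "Organization"));

        // Hypothetical system output: the organisation span is too short.
        Set<Annotation> response = Set.of(
                new Annotation(0, 17, "Person"),
                new Annotation(45, 54, "Organization"));

        for (Annotation a : key) {
            System.out.println(a.type() + ": " + covered(text, a));
        }
        System.out.println("precision = " + precision(key, response));
        System.out.println("recall    = " + recall(key, response));
    }
}
```

Because the annotations live apart from the text, several annotation sets (for example a manual key and a system response) can describe the same document and be compared directly, which is the basis of a difference viewer.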
To contact the GATE project, mail
gate-crashers at dcs.shef.ac.uk
or see the support page
http://gate.ac.uk/support.html
Regards,
Dr. Hamish Cunningham
Department of Computer Science
University of Sheffield, UK
http://www.dcs.shef.ac.uk/~hamish