[Corpora-List] annotation tools - summary of replies
Alberto Lavelli
lavelli at itc.it
Tue Jul 13 12:48:15 UTC 2004
This is the summary of the responses I received to the following query
I sent at the beginning of July:
> I'm interested in graphical tools for manual annotation of texts. The
> goal is to manually annotate documents to train/test IE systems. In
> particular, I'm interested in tools that allow annotating not only
> entities (e.g. Named Entities) but also relations between such
> entities (e.g., the relations of the Template Relation task in MUC-7:
> employee_of connecting person and organization, or location_of
> connecting organization and location). Tools I'm already aware of:
>
> - the ALEMBIC Workbench by MITRE (already downloaded; it's the tool
> I'm most familiar with)
> - WordFreak (downloaded a few days ago; one of the problems with
> WordFreak seems to be the shortage of documentation)
> - the ACE annotation tools by LDC
>
> I have had a look at the old messages of the list without being able
> to find anything interesting. I have already consulted the web page
> on Linguistic Annotation by Steven Bird and Mark Liberman
> (http://www.ldc.upenn.edu/annotation/) but I have found nothing which
> appears to be relevant (BTW, the last update was in December 2001).
> I'm particularly interested in first-hand experience with the
> tools (including the ones mentioned above).
First of all, thanks to the following people who replied to my query:
Ivana Kruijff-Korbayová, David Day, Jean Carletta, Caren Brinckmann,
Rodrigo Goulart, Narjes Boufaden, Constantin Orasan, Gerard Peregrin,
and Tom Morton. Some of them are directly involved in the development
of the tools mentioned below, some are simply users of the tools. The
people directly involved in the development of the tools have offered
to provide their help. I think they will be glad to be contacted by
people interested in using their tools.
Here are the web pages of the tools mentioned in my original message
(I didn't provide them there, but I think they may be
useful):
- the ALEMBIC Workbench by MITRE:
http://www.mitre.org/technology/alembic-workbench/
- WordFreak: http://sourceforge.net/projects/wordfreak
- the ACE annotation tools by LDC:
http://www.ldc.upenn.edu/Projects/ACE/Tools/
This is the list of the tools mentioned by other people (together with
their web pages):
- MMAX, developed at EML Heidelberg: http://mmax.eml-research.de
- Callisto, by MITRE: http://callisto.mitre.org/
- NXT (NITE XML Toolkit): http://www.ltg.ed.ac.uk/NITE
- GATE, by the University of Sheffield: http://gate.ac.uk/
- PALinkA: http://clg.wlv.ac.uk/projects/PALinkA/
- Clark System: http://www.BulTreeBank.org/
Below I have collected some comments on the tools by the people who
replied. Note that I have not yet tried the tools, so I can't offer
any informed judgment about their suitability for my annotation
task.
MMAX has been suggested by three different people who have been using
it and are satisfied with its performance.
MITRE: David Day wrote that they have developed a new annotation tool
called Callisto, which was recently made available at
http://callisto.mitre.org/. It represents a different
design philosophy from the Alembic Workbench, in that it accepts
independently-developed plug-ins to support individual annotation
tasks. For a variety of tasks (e.g., ACE, EELD and BioInformatics)
instances of task interfaces have been developed in which the user can
establish relationships among different phrases, or among different
annotations on phrases. The number and variety of supported annotation
tasks is growing. For "generic" tasks, or those for which a lot of
variation is expected, the possible values and the constraints
between those values can be modified declaratively (using the
Relax NG system).
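As an illustration of the kind of annotation discussed in this thread
(entities plus typed relations between them, with the allowed argument
types declared separately from the code), here is a minimal Python
sketch. It is not the data format of Callisto or of any other tool
listed here, and it does not use Relax NG; the class names, the
RELATION_SCHEMA table, the mark() and validate() helpers and the
sample sentence are all hypothetical. Only the relation names
employee_of and location_of come from the original query.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """A stand-off entity annotation: a typed character span in the text."""
    id: str
    type: str
    start: int
    end: int


@dataclass(frozen=True)
class Relation:
    """A typed, directed link between two entity annotations (by id)."""
    type: str
    arg1: str
    arg2: str


# Hypothetical declarative schema: relation type -> (type of arg1, type of arg2).
RELATION_SCHEMA = {
    "employee_of": ("person", "organization"),
    "location_of": ("organization", "location"),
}


def mark(text, entity_id, entity_type, surface):
    """Create an entity for the first occurrence of `surface` in `text`."""
    start = text.index(surface)
    return Entity(entity_id, entity_type, start, start + len(surface))


def validate(relation, entities):
    """Return True if both arguments exist and have the types the schema allows."""
    expected = RELATION_SCHEMA.get(relation.type)
    if expected is None:
        return False
    a1 = entities.get(relation.arg1)
    a2 = entities.get(relation.arg2)
    if a1 is None or a2 is None:
        return False
    return (a1.type, a2.type) == expected


# Made-up sample sentence, used only to exercise the sketch.
text = "Jane Smith works at Acme Corp in Boston."
entities = {
    e.id: e
    for e in (
        mark(text, "e1", "person", "Jane Smith"),
        mark(text, "e2", "organization", "Acme Corp"),
        mark(text, "e3", "location", "Boston"),
    )
}
relations = [
    Relation("employee_of", "e1", "e2"),
    Relation("location_of", "e2", "e3"),
]
valid = [validate(r, entities) for r in relations]
```

Changing what counts as a valid relation only requires editing the
RELATION_SCHEMA table, which is the general idea behind a declarative
task definition.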
NXT: users of NXT (the NITE XML Toolkit, a library that supports the
building of tools for hand-annotation) have built applications for
both named entity annotation and anaphoric links between entities. It
would not be a big effort to cobble a tailored tool together based on
these applications.
WordFreak: one of the authors of WordFreak (Tom Morton) wrote that the
best way to obtain information about WordFreak is to post questions to
the SourceForge forums for WordFreak
(http://sourceforge.net/forum/?group_id=75013). Typically he adds
docs for any question he answers so the next person doesn't have to
ask it. WordFreak has relation support which is actively used to do
biological annotation.
I hope I have not forgotten anything relevant.
best
alberto