[Corpora-List] Summary of responses: manual annotation tool needed

ted pedersen tpederse at d.umn.edu
Thu Mar 3 15:58:41 UTC 2005


Greetings all,

A week or two ago I posted the following note, and got a number of very nice
responses, which I have tried to summarize below. A few responses mentioned
commercial products, but I haven't included those, as we were only looking
for free software solutions...

My original query:

-------
We plan to classify several thousand small units of text (approx 200-500
words each) into a hierarchy of topics. We will do the classification
manually, and we will also design the hierarchy manually. So there is
nothing automatic going on here; we are doing manual annotation.

We would like to have a tool that will let us view these units of text one
by one, and then classify each into our hierarchy. We'd like to be able to
add new topic nodes to the hierarchy, and also move units of text from one
node to another, or maybe even merge or split apart nodes as we refine our
hierarchy.

Are there any free software tools out there that will let us do this
fairly easily? We are able to use Windows or Linux or Solaris for this.
We can convert our data into whatever form would be needed. Right now it
is a big directory structure made up of plain text.
-------

And the suggestions...

http://www.wagsoft.com/Coder/

(In fact, this is the one we are now using, and it seems to be working out
reasonably well. It's not that the other options didn't work; this one just
seemed fairly easy to deal with, and it runs on Solaris and Windows, which
are the platforms we need...)

Two responses suggested the CLaRK system

http://www.bultreebank.org/clark/index.html

which is a very nicely put together tool with lots of documentation, etc.

Another user suggested the "kate" editor for Linux, with the following
explanation...

    It is not exactly what you need, but you may be interested in having
    a look at "kate", which is a text editor that comes with KDE.
    It has a window with an editor area (to view the text), a folder
    management area (to view and manipulate folders and files, which you
    can use to manage categories), and a command line area (which you
    will probably not need, and may be dismissed)
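
As a rough illustration of that folder-as-category idea (not part of the
responder's note), the operations from the original query can be mapped onto
plain directory operations. This is only a minimal sketch, assuming each unit
of text is a plain-text file and each topic node is a directory; the function
names add_topic, move_unit, and merge_topics are hypothetical and do not come
from any of the tools listed here.

```python
import shutil
from pathlib import Path


def add_topic(root: Path, topic: str) -> Path:
    """Create a topic node (a folder); nested topics use '/' separators."""
    node = root / topic
    node.mkdir(parents=True, exist_ok=True)
    return node


def move_unit(root: Path, unit: str, src_topic: str, dst_topic: str) -> Path:
    """Reclassify one text unit by moving its file to another topic folder."""
    dst = add_topic(root, dst_topic) / unit
    if dst.exists():
        raise FileExistsError(f"{dst} already exists")
    shutil.move(str(root / src_topic / unit), str(dst))
    return dst


def merge_topics(root: Path, src_topic: str, dst_topic: str) -> None:
    """Merge src_topic into dst_topic, then remove the emptied source folder."""
    src = root / src_topic
    for path in sorted(src.iterdir()):
        if path.is_file():
            move_unit(root, path.name, src_topic, dst_topic)
    src.rmdir()  # raises OSError if subfolders remain, which is intentional
```

Splitting a node would then amount to calling add_topic for the new node and
move_unit for each file that belongs in it.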

Another responder suggested we check out QDA software, and provided the
following link, which contains a comparative analysis of various options,
some commercial, some not...

http://www.lboro.ac.uk/research/mmethods/research/software/caqdas_comparison.html

Another response pointed us to this demo:

http://nlp.cs.jhu.edu/~gsm/pd_demo

with the following instructions...

    To see what it can do, from the main menu choose:
    "Choose Name" and type "Abby Watkins"
    and then select "Load saved clustering" from the main menu.
    At that point you can view the source web pages, and rearrange the
    clustering.  It's in Java, and runs over the network (there's a UNIX
    back-end).

Another response mentioned MMAX. Its more recent version is not free, but the
earlier free version seems to work well (according to the responder)

http://www.eml-research.de/english/research/nlp/download/index.php

And then another response pointed us to this impressive tool...

http://annotation.semanticweb.org/ontomat/index.html

Thanks very much for all these great responses; they were really a big help.
If there are other tools out there, please feel free to mention them as well.
While we like the Coder so far, we are always willing to look at other
options (and we'll be checking out several of the above options a bit more
extensively as time goes by...)

Cordially,
Ted

--
Ted Pedersen
http://www.d.umn.edu/~tpederse


