[Corpora-List] Algorithm for hypernym chaining?

Adam Funk a.funk at dcs.shef.ac.uk
Thu Jul 1 09:42:06 UTC 2010


[25/06/10 15:50] Ken Litkowski wrote:
> I think that graph algorithms (search for "algorithms for directed
> graphs") should do the trick efficiently. The key to your problem is
> that each of your entries constitutes an "edge" between "vertices". You
> compile a list of such edges and the algorithms then go through a
> depth-first search to construct the full digraph. Although these
> algorithms are NP-complete, the number of vertices (e.g., in WordNet) is
> sufficiently small that your task can be accomplished in at most a
> matter of a few minutes. WordNet has the added advantage that there are
> no cycles. (I have C++ code I'm willing to share, but it might not be
> the most efficient for your purposes.)

Thanks very much for this advice.  I wrote a program that reads the file
and generates a set of nodes, each with a number, a set of daughters,
and a set of parents; finds all the root nodes (with no parents) and
then extends all possible chains from them.  It ran in a surprisingly
short time (a few seconds in Java).

I obtained 45 chains with "loops" in them (I wrote the program to stop
extending a chain when it adds a node that it detects is already in the
chain higher up), and they all contain the sequence
"202422663,202423762".  It turns out that this is the "restrain ->
inhibit" loop that Steven Bethard warned me about.  According to my
calculations this is the only circularity left in WordNet!

_______________________________________________
Corpora mailing list
Corpora at uib.no
http://mailman.uib.no/listinfo/corpora



More information about the Corpora mailing list