Soft: Tgrep2
alexis nasr
alexis.nasr at lim.univ-mrs.fr
Wed May 30 14:43:53 UTC 2001
The readers of this list may be interested in a new tool, tgrep2, that I
have developed for searching parsed corpora such as those included in
the Penn Treebank.
As the name might suggest, tgrep2 is based on tgrep and is largely
backward compatible. However, tgrep2 adds a number of new features,
including the following major enhancements:
* Rather than simply having a set of required relationships and a set of
prohibited relationships, nodes can have full boolean expressions of
relationships to other nodes.
* Nodes can be given unique labels and may then be referred to by those
labels in the pattern specification or in selecting trees for printing.
* Patterns are no longer restricted to simple tree architectures. The use
of node labels and segmented patterns allows links in a pattern to form
back-edges as well, permitting cycles of links.
* Customizable output formats allow a variety of information to be
reported in a flexible manner.
* Multiple search patterns may be specified and one can retrieve the
first subtree matching any pattern, the first subtree matching each
pattern, or all subtrees matching all patterns.
* Subtrees can be reported using a code rather than by printing the
whole structure. The trees themselves can later be retrieved using the
codes.
* A variety of new links have been added and the immediately-precedes
link now has a more conventional meaning.
* Tgrep2 corpus files are substantially smaller than tgrep corpora.
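
To make the first enhancement above more concrete, here is a minimal Python sketch of the difference between a flat set of required and prohibited relationships and a full boolean expression over relationships. It is not tgrep2's pattern syntax or its implementation; the Node class, the has_child and dominates relationships, and the all_of, any_of and negate combinators are hypothetical names chosen for illustration only.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterator, List


@dataclass
class Node:
    """A parse-tree node: a label plus an ordered list of children."""
    label: str
    children: List["Node"] = field(default_factory=list)


Pred = Callable[[Node], bool]


def subtrees(node: Node) -> Iterator[Node]:
    """Yield the node and all of its descendants in pre-order."""
    yield node
    for child in node.children:
        yield from subtrees(child)


# Hypothetical relationships, each returning a predicate on a node.
def has_child(label: str) -> Pred:
    return lambda n: any(c.label == label for c in n.children)


def dominates(label: str) -> Pred:
    # True if some proper descendant of the node carries the label.
    return lambda n: any(
        d.label == label for c in n.children for d in subtrees(c)
    )


# Boolean combinators over relationships.
def all_of(*preds: Pred) -> Pred:
    return lambda n: all(p(n) for p in preds)


def any_of(*preds: Pred) -> Pred:
    return lambda n: any(p(n) for p in preds)


def negate(pred: Pred) -> Pred:
    return lambda n: not pred(n)


def search(root: Node, label: str, pred: Pred) -> List[Node]:
    """Return every subtree whose root has the label and satisfies pred."""
    return [n for n in subtrees(root) if n.label == label and pred(n)]


# (S (NP (NP (DT the) (NN dog)) (PP (IN in) (NP (DT the) (NN park))))
#    (VP (VBD barked)))
tree = Node("S", [
    Node("NP", [
        Node("NP", [Node("DT", [Node("the")]), Node("NN", [Node("dog")])]),
        Node("PP", [
            Node("IN", [Node("in")]),
            Node("NP", [Node("DT", [Node("the")]), Node("NN", [Node("park")])]),
        ]),
    ]),
    Node("VP", [Node("VBD", [Node("barked")])]),
])

# An NP that has a PP child OR an SBAR child, AND does not dominate a CC.
# The disjunction cannot be written as one list of required relationships
# plus one list of prohibited relationships.
query = all_of(any_of(has_child("PP"), has_child("SBAR")),
               negate(dominates("CC")))
matches = search(tree, "NP", query)
```

In this toy tree only the outermost NP satisfies the query, because it is the only NP with a PP child and no NP here dominates a CC.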
More information and the tgrep2 software can be found at the following
site:
http://www.cs.cmu.edu/~dr/Tgrep2/
Doug Rohde
Carnegie Mellon University
-------------------------------------------------------------------------
Message distributed by the Langage Naturel list <LN at cines.fr>
Information, subscription: http://www.biomath.jussieu.fr/LN/LN-F/
English version: http://www.biomath.jussieu.fr/LN/LN/
Archives: http://listserv.linguistlist.org/archives/ln.html
The LN list is sponsored by ATALA (Association pour le Traitement
Automatique des Langues)
Information and membership: http://www.atala.org/
-------------------------------------------------------------------------