PENN TREE BANK STYLE NLP SOFTWARE AVAILABLE

Anne Sing annes at HTDC.ORG
Tue Jan 27 05:16:18 UTC 1998


To the readers:
Derek Bickerton and Phil Bralich of Ergo Linguistic Technologies would like
to announce the release of free software to the Computational Linguistics,
NLP, MT, and linguistics communities.  The software offering is a
pre-release called "BracketDoctor."  It provides a parsed analysis of input
strings including labeled brackets and trees in the style of the Penn
Treebank of the Linguistic Data Consortium as outlined in "Bracketing
Guidelines for Treebank II Style Penn Treebank Project" (Linguistic Data
Consortium 1995).  While the entire range of structures of that work is not
supported, this is the only parser that can generate any such trees and
brackets and thus represents a major breakthrough for this field.
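
For readers unfamiliar with the notation, the following is a minimal sketch
of what a Penn Treebank style labeled bracketing looks like and how it maps
to a tree.  The sample sentence and its bracketing are hand-written for
illustration and are not output from BracketDoctor; the function names
parse_brackets and render are hypothetical helpers, and the sketch assumes a
single well-formed bracketing in which every bracket carries a label.

```python
import re

def parse_brackets(text):
    """Parse one labeled bracketing into nested (label, children) tuples."""
    # Tokens are "(", ")", or any run of characters that is neither
    # whitespace nor a parenthesis (a label or a word).
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def read_node():
        nonlocal pos
        if tokens[pos] != "(":
            raise ValueError("expected '('")
        pos += 1
        label = tokens[pos]          # e.g. S, NP, VP, DT, NN
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read_node())   # nested constituent
            else:
                children.append(tokens[pos])   # a word (leaf)
                pos += 1
        pos += 1                     # consume the closing ")"
        return (label, children)

    tree = read_node()
    if pos != len(tokens):
        raise ValueError("unexpected text after the bracketing")
    return tree

def render(tree, depth=0):
    """Return the tree as a list of indented lines, one node per line."""
    label, children = tree
    indent = "  " * depth
    if all(isinstance(child, str) for child in children):
        # A node whose children are all words is shown on one line.
        return [indent + " ".join([label] + children)]
    lines = [indent + label]
    for child in children:
        if isinstance(child, str):
            lines.append("  " * (depth + 1) + child)
        else:
            lines.extend(render(child, depth + 1))
    return lines

# Hand-written illustrative bracketing (an assumption, not program output).
SAMPLE = "(S (NP (DT The) (NN dog)) (VP (VBD barked)))"

if __name__ == "__main__":
    print("\n".join(render(parse_brackets(SAMPLE))))
```

The labeled brackets and the indented tree carry the same information; the
tree is simply a more readable rendering of the nesting.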

We understand that this is unlikely to be nominated for citations or awards,
but as this is the only software available that can generate such labeled
brackets and trees, we believe it is an important contribution to this field
of research and it should be of value to researchers in academia and
industry alike as well as to students working through their introductory
syntax textbooks.  We are announcing this release to linguistics
news-lists, translation lists, and the like as well as to our entire mailing
list of researchers and decision makers in industry, government, and
academia.  We feel this release is particularly important because even the
major universities such as Stanford and MIT as well as companies such as
Microsoft, IBM, and Xerox do not have programs that offer this sort of
demonstration of their ability to work with the Penn Treebank styles.

Of course we recognize the importance of being aware of the entire field of
NLP and of not misrepresenting such things to government, industry, or
academia, so we feel it is important to distribute this as widely as
possible as quickly as possible.  As this is the only parser that generates
Penn Treebank style labeled bracketings and trees, and as NLP, Linguistics,
and Computational Linguistics communities have agreed that the Penn Treebank
styles are the standard for this field, we feel compelled to suggest that this
parser be accepted as the default standard for parsers in the field today
until such time as other parsers can show that they can do an equal or
better job with the Penn Treebank style book, or until such time as the Penn
Treebank styles are removed as the standard.  We will also be distributing
this software to members of the LDC, EAGLES, the organizers of the MUC
conferences, and other organizations that propose to set standards for NLP.
(For possible alternative standards for NLP other than these, go to
http://www.vrml.org/WorkingGroups/NLP-ANIM).

We realize such claims as these may invite accusations of arrogance or
immodesty, but what is the point of having such standards as the Penn
Treebank II guidelines if the one parser that can generate them is NOT given
a central role in the field as a whole and in the journals as the standard
against which all other parsers must be measured?

As long as we are the default standard for the generation of trees and
brackets in the Penn Treebank style, then many publications and proposals
in NLP will need to mention this software in their review of current
technologies and work.  For that purpose, the reference should refer to
Philip Bralich and Derek Bickerton, 1998.  "BracketDoctor," Ergo Linguistic
Technologies, Honolulu, Hawaii.

The BracketDoctor can be obtained by writing to Derek Bickerton
(derek at hawaii.edu) or Phil Bralich (bralich at hawaii.edu) or it can be downloaded
from our web site.  It is a standard Windows 95 program in a setup file.  It
requires 1000 kilobytes of disk space and less than one megabyte of RAM to run.
Sentences parse in real time.

Phil Bralich

P.S. For those who can sign a non-disclosure agreement it is also possible
to receive the product called "MemoMaster" which demonstrates our abilities
with: 1) question/answer, statement/response repartee (using notes and
reminders), 2) NLP messaging for sending faxes, email, and memos, and 3)
command and control for browsers and operating systems (a great add-on for
any speech recognition system).  Just email me or send a fax to (808)539-3924
requesting the non-disclosure.


Philip A. Bralich, Ph.D.
President and CEO
Ergo Linguistic Technologies
2800 Woodlawn Drive, Suite 175
Honolulu, HI 96822

Tel: (808)539-3920
Fax: (808)539-3924



More information about the Funknet mailing list