7.1241, Sum: Grammar development environments
The Linguist List
linguist at tam2000.tamu.edu
Sat Sep 7 14:34:53 UTC 1996
---------------------------------------------------------------------------
LINGUIST List: Vol-7-1241. Sat Sep 7 1996. ISSN: 1068-4875. Lines: 262
Subject: 7.1241, Sum: Grammar development environments
Moderators: Anthony Rodrigues Aristar: Texas A&M U. <aristar at tam2000.tamu.edu>
Helen Dry: Eastern Michigan U. <hdry at emunix.emich.edu> (On Leave)
T. Daniel Seely: Eastern Michigan U. <dseely at emunix.emich.edu>
Associate Editor: Ljuba Veselinova <lveselin at emunix.emich.edu>
Assistant Editors: Ron Reck <rreck at emunix.emich.edu>
Ann Dizdar <dizdar at tam2000.tamu.edu>
Annemarie Valdez <avaldez at emunix.emich.edu>
Software development: John H. Remmers <remmers at emunix.emich.edu>
Editor for this issue: dseely at emunix.emich.edu (T. Daniel Seely)
---------------------------------Directory-----------------------------------
1)
Date: Thu, 05 Sep 1996 10:50:16 -0000
From: paul at ccl.umist.ac.uk (Paul Bennett)
Subject: Grammar development environments - summary
---------------------------------Messages------------------------------------
1)
Date: Thu, 05 Sep 1996 10:50:16 -0000
From: paul at ccl.umist.ac.uk (Paul Bennett)
Subject: Grammar development environments - summary
A little while ago I asked for information about grammar development
environments (GDEs) as teaching tools. The particular questions I had
were as follows:
1. Information on availability
2. Hardware/software requirements
3. Range of formalisms supported
4. Extent of customisability if any
5. Ease of writing and testing grammars
6. Quality of display of analyses on screen
7. Ease of debugging grammars
8. Speed and reliability
9. General user-friendliness
10. How used in teaching, and with what kind of students
I'm very grateful to those who responded, viz.:
Melina Alexa: alexa at darmstadt.gmd.de
Charles Boisvert: charles at ccl.umist.ac.uk
Ian Crookston: I.Crookston at uk.ac.lmu
Chris Culy: cculy at edu.uiowa.weeg.vaxa
Mary Dalrymple: dalrymple at parc.xerox.com
Stephen Nightingale: night at uk.ac.ed.ling
Bilge Say: say at bilkent.edu.tr
Nancy Underwood: nancy at dk.ku.cst
Martin Volk: volk at ch.unizh.ifi
Shuly Wintner: shuly at cs.technion.ac.il
- --------------------------------
Here is a summary of the systems available, and the information I
received:
- -
A useful web address is Natural Language Software Registry Homepage at
http://cl-www.dfki.uni-sb.de/cl/registry/draft.html#top
- -
`Syntactica' by Richard Larson et al. is a teaching tool designed for
undergraduates; see the MIT Press linguistics catalogue. It runs on PCs
running NeXTStep and on NeXT stations, and a Windows 95 version is
apparently promised for next year.
- -
Chris Culy has written a simple GDE for CFGs using HyperCard on the Mac.
It's shareware (US$10). It's available through:
http://www.uiowa.edu/~linguist/classes/lfr-fall96/index.html
- -
SIL offers a parser and lexicon as part of its PC-PATR system. See:
http://www.sil.org/
- -
Linguistic Instruments in Gothenburg have written some grammar
development environments for the Mac. The formalisms covered are CFG,
PATR, categorial grammar and DCG. The user interface is fairly
friendly, with nice graphics for structures. This is commercial
software; a site licence for the whole system costs $500. Contact:
lager at se.gu.ling.
- -
A number of LFG-specific systems are available. Here is a summary of the
information I received from Mary Dalrymple:
a. The Xerox LFG Grammar Writer's Workbench is a complete parsing
implementation of the LFG syntactic formalism, including various
features introduced since the original KB82 paper (functional
uncertainty, functional precedence, generalization for coordination,
multiple projections, etc.). Runs under DOS on PCs and on most Unix
systems. Contact Ron Kaplan (kaplan at parc.xerox.com)
b. Avery Andrews has written a small LFG system that runs on PCs (XTs,
in fact) and is basically oriented towards producing small fragments
to illustrate aspects of grammatical analysis in basic LFG. Contact:
Avery.Andrews at anu.edu.au
Available from: http://www.anu.edu.au/linguistics/software/lfg20.exe
c. Charon is available from ftp.ims.uni-stuttgart.de
in the directory /pub/Charon. Requires Unix and a Prolog which conforms
to the Edinburgh syntax. Contact: Dieter Kohl
(dieter at ims.uni-stuttgart.de)
d. The Konstanz LFG Workbench is a simple system used for an
introductory LFG course. It accepts syntax rules
in very nearly conventional notation with simple functional equations
(no boolean operators) and constraint equations (=c), and allows you to
project an f-structure to a set of semantic implications via lexical
semantics written in a Prolog-like notation. Contact Bruce Mayo
(bruce.mayo at pan.rz.uni-konstanz.de)
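To illustrate the kind of computation these LFG tools perform, here is a
minimal sketch in Python (hypothetical code, not taken from any of the
systems above; the lexicon, rule annotations, and function names are
invented for illustration): lexical entries contribute attribute-value
structures, and functional equations on a rule such as (up SUBJ) = down
assemble them into an f-structure by unification:

```python
# Toy illustration of LFG-style f-structure construction (not code from
# any of the systems listed above). The rule S -> NP VP carries the
# annotations (up SUBJ) = down on NP and up = down on VP.

def unify(f, g):
    """Merge two f-structures (dicts), failing on conflicting atomic values."""
    out = dict(f)
    for key, val in g.items():
        if key not in out:
            out[key] = val
        elif isinstance(out[key], dict) and isinstance(val, dict):
            out[key] = unify(out[key], val)
        elif out[key] != val:
            raise ValueError(f"clash on {key}: {out[key]} vs {val}")
    return out

# Hypothetical mini-lexicon: each word's f-structure contribution.
LEXICON = {
    "John":   {"PRED": "John", "NUM": "sg", "PERS": 3},
    "sleeps": {"PRED": "sleep<SUBJ>", "SUBJ": {"NUM": "sg", "PERS": 3}},
}

def f_structure(subject, verb):
    """S -> NP VP with (up SUBJ) = down on NP and up = down on VP."""
    f = unify({}, LEXICON[verb])              # up = down: verb features to S
    f = unify(f, {"SUBJ": LEXICON[subject]})  # (up SUBJ) = down: NP under SUBJ
    return f

print(f_structure("John", "sleeps"))
```

The verb's SUBJ requirements (NUM, PERS) unify with the subject NP's own
features, so agreement clashes fail exactly where an LFG analysis says
they should.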
- -
XGrammarTool (from GMD IPSI) is a Smalltalk-based toolkit and has a
general top-down parser for user-specified grammars which can be written
in a BNF-like language. It hasn't really been used in teaching, though. It
needs VisualWorks (Smalltalk) from ParcPlace, either 2.0 or 2.5. A short
description can be found as part of a paper in Electronic Publishing vol
6, 1993, pp. 495-505. Contact rostek at de.gmd.darmstadt.
- -
See COLING 1996 Proceedings (vol2, pp 1057-60) for a description of the
GATE project from Sheffield.
- -
Bob Carpenter's ALE is available at http://macduff.andrew.cmu.edu/ale/
Requires Quintus or SICStus Prolog, but nothing else.
It provides for writing grammars using typed feature structures, in
anything from PATR to HPSG and CCG. It's very reliable, and there's a
very good 100-page user manual, also available in HTML. A home page
dedicated to ALE carries lots of information. There's also a Web site
titled "Course Notes on HPSG in ALE", by
Colin Matheson, which can be extremely useful if you're planning to
teach HPSG. The URL is:
http://www.ltg.hcrc.ed.ac.uk/projects/ledtools/ale-hpsg/index.html
ALE requires a fair amount of linguistic and formal sophistication, however.
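For a flavour of the formalism ALE supports, here is a minimal sketch in
Python (a hypothetical illustration, not ALE's actual Prolog syntax or
implementation; the type hierarchy is invented): in a typed feature
structure, unification must both merge the features and resolve the two
types to their most specific common type:

```python
# Toy typed-feature-structure unification, sketching the idea behind
# ALE-style formalisms (hypothetical; not ALE's implementation).
# Each node is (type, features); types live in a small hand-built
# hierarchy, with "bot" as the most general type.

SUPERTYPES = {"noun": "head", "verb": "head", "head": "bot", "sign": "bot"}

def ancestors(t):
    """The type itself plus all its supertypes."""
    out = {t}
    while t in SUPERTYPES:
        t = SUPERTYPES[t]
        out.add(t)
    return out

def unify_types(t1, t2):
    """Most specific common type; here one must subsume the other."""
    if t1 in ancestors(t2):
        return t2          # t2 is at least as specific as t1
    if t2 in ancestors(t1):
        return t1
    raise ValueError(f"types {t1} and {t2} are incompatible")

def unify(fs1, fs2):
    """Unify two typed feature structures (type, feature-dict)."""
    t = unify_types(fs1[0], fs2[0])
    feats = dict(fs1[1])
    for attr, val in fs2[1].items():
        feats[attr] = unify(feats[attr], val) if attr in feats else val
    return (t, feats)

# A "head" carrying AGR unifies with the more specific bare "noun":
print(unify(("head", {"AGR": ("bot", {"NUM": ("sg", {})})}),
            ("noun", {})))
```

Incompatible types (e.g. noun vs. verb) make unification fail outright,
which is how typed systems catch grammar errors that untyped PATR-style
systems would silently let through.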
- -
Micro-NLP is a system specifically designed to teach grammar development
and aspects of parsing. It is written by Charles Boisvert at UMIST. The
answers to my questions (see above) are:
1. Contact charles at ccl.umist.ac.uk.
Also http://www.ccl.umist.ac.uk/charles/micro-nlp.html
2. Runs under SICStus 2.1#7 on Sun workstations
3. Context free unification grammars, with Prolog term or
feature:value set unification. Disjunction and negation of (atomic) values.
Several example grammars are included:
- a DCG
- a context-free grammar using features
- a grammar with a slash feature for gap-threading
- a lexically-based grammar with 2 non-terminals
(and a head-corner generator)
4. At a basic level, traces can be switched on and off and controlled
(like Prolog traces). There are two parsers and many different
ways to print feature structures. A Prolog programmer could re-use
elements to write, e.g., their own parsers, or use the feature
structures for completely different purposes.
5. Grammars are edited in a text editor and saved/consulted like
programs. There is no need to compile the grammar before
testing. For feature sets, I follow the syntax of Gazdar &
Mellish, which is described in their textbook.
Testing is similar to what you obtain from Prolog, so you
can analyse strings, generate strings with given
characteristics, look at alternative parses/generated
phrases.
6. It is text rather than graphics, but there is a very robust
routine for pretty printing mixtures of terms and feature
structures.
7. The tracers let you step through a top-down or a left-corner
parser, which is a good way to identify bugs (if using one
parser gives no clue, try the other one). The ability to
generate phrases is also useful. Finally, because the code is
interpreted, it is easy to make small changes one at a time and
debug progressively, in an iterative edit-save-consult-test cycle.
8. Efficiency has not been my main concern with this system. The
parsers were written to step through them and the grammars are
interpreted, so it is not surprising that it is slow. On my toy
grammars, which have on the order of 20-30 lexical entries and
up to ten rules, parsing times vary from 0.1 seconds to several seconds.
9. Micro-NLP isn't a mouse-and-graphic system, but
its flexibility is user-friendliness: easy to write grammars,
easy to run, easy to read the results and the processes, easy
to relate to Prolog if required. Because of that flexibility, a
user who has done a bit of Prolog would feel at home; conversely,
thanks to the good presentation of data structures, a user could
also start with Micro-NLP, become familiar with search techniques
through the high-level tracing, and then move on to Prolog.
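The style of system described above can be sketched in a few lines of
Python (a hypothetical illustration, not Micro-NLP's actual Prolog code,
and omitting its disjunction and negation of values; the grammar and
lexicon are invented): a top-down parser over a context-free grammar
whose categories carry feature:value sets, unified as the parse proceeds:

```python
# Toy top-down parser over a CFG with feature:value sets, in the spirit
# of the system described above (hypothetical; not Micro-NLP itself).
# One feature set is threaded through the parse, so agreement features
# contributed by any word must be consistent with all the others.

def unify(f, g):
    """Combine two feature:value sets; None signals failure on a clash."""
    out = dict(f)
    for k, v in g.items():
        if k in out and out[k] != v:
            return None
        out[k] = v
    return out

GRAMMAR = {                      # lhs symbol -> list of right-hand sides
    "s":  [["np", "vp"]],
    "np": [["det", "n"]],
    "vp": [["v"]],
}
LEXICON = {                      # word -> (category, feature:value set)
    "the":   ("det", {}),
    "dog":   ("n", {"num": "sg"}),
    "dogs":  ("n", {"num": "pl"}),
    "barks": ("v", {"num": "sg"}),
    "bark":  ("v", {"num": "pl"}),
}

def parse(symbol, words, feats):
    """Yield (remaining words, features) for each way symbol spans a prefix."""
    if words and words[0] in LEXICON:
        cat, lexfeats = LEXICON[words[0]]
        if cat == symbol:
            merged = unify(feats, lexfeats)
            if merged is not None:
                yield words[1:], merged
    for rhs in GRAMMAR.get(symbol, []):
        states = [(words, feats)]
        for child in rhs:
            states = [res for w, f in states for res in parse(child, w, f)]
        yield from states

def accepts(sentence):
    return any(not rest for rest, _ in parse("s", sentence.split(), {}))

print(accepts("the dog barks"))   # -> True
print(accepts("the dog bark"))    # -> False (num clash: sg vs pl)
```

Because the parser is a plain generator, stepping through it in a
debugger gives much the same view as the high-level tracing described
above.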
- -
GTU (``Grammatik Testumgebung'') is a large GDE for German written at
the University of Koblenz. See:
@InProceedings{Volk95b,
author = "M. Volk and M. Jung and D. Richarz",
title = "{GTU - A workbench for the development of natural
language grammars}",
booktitle = "Proc. of the Conference on Practical Applications of
Prolog",
year = 1995,
pages = {637-660},
address = "Paris"
}
Questionnaire answers:
1. GTU can be obtained for a nominal fee from the University of Koblenz.
Contact Dirk Richarz at richarz at informatik.uni-koblenz.de
2. GTU runs on Sun workstations. It is compiled SICStus Prolog code.
3. DCG (with feature structures), ID/LP (with feature structures), LFG, GPSG
4. Many features can be switched on/off. Among them:
- checking selectional restrictions
- computing logical forms
- access to three different lexicons
- access to two different test-suite interfaces
5. flexible lexicon interface (three different lexicons can be hooked to
any grammar), test suite administration tool, special hypertext help
6. Graphic display of c-structure-trees, f-structures, feature
structures
7. - GTU includes static grammar checks (e.g. for circular LP-rules).
- GTU has a tool to compare parse-trees with previously computed
parse-trees.
- The test suite can be partitioned and fed into any of the parsers.
8. Very reliable. Very fast for small grammars and small lexicons.
9. Generally regarded as high, although GTU has now grown into a
complex system that takes some time to learn.
10. CL students in the course ``Methods of syntax analysis'' are asked
to write grammars (or compile test sentences) for special syntactic
phenomena.
- -
LRP/C - Masaru Tomita's Parser Compiler environment. Here (abbreviated
by me) are the questionnaire answers from Shuly Wintner:
1. Not clear - should be available from CMU - or check with Alon Lavie,
alavie+ at f.gp.cs.cmu.edu.
2. The system is written in Lisp; you need a Common Lisp
environment. I managed to run it on various versions of Sun machines,
under various versions of Unix.
3. LRP/C was designed with LFG in mind, but it can be used for a wide
range of phrase structure grammars. It has built-in unification and a
hook to Lisp.
4. Source code available.
5. Not bad. A simple but useful debugging tool, tracer etc.
6. Poor - all output is linearly displayed, as (very long) lists.
7 and 8. Performance wasn't bad, as far as I remember, but the system
used to behave strangely on very complex grammars (I have written an
extensive grammar for Hebrew using it). For teaching purposes I can't
foresee any problem at all.
9. So-so. There used to be a user's manual and a technical report
describing the system. No on-line help of any kind.
10. Mostly undergraduate CS students with no knowledge of linguistics.
===
Paul Bennett
UMIST
------------------------------------------------------------------------
LINGUIST List: Vol-7-1241.