"just-in-time" sub-grammar extraction
Vlado Keselj
vkeselj at uwaterloo.ca
Tue Feb 13 21:47:37 UTC 2001
On Tue, 13 Feb 2001, Ann Copestake wrote:
> maybe you'd like to expand on this a bit? The work on subgrammars that
> I know of does not aim for complete consistency with the results of the
> bigger grammar - in fact, generally the point is to cut down on the
> number of readings.
Hi Ann,
Thanks for your reply and the references. I agree that cutting down on
the number of readings is a desirable effect of subgrammar extraction.
So I don't insist on the complete consistency condition; it is just that
the problem seems to be theoretically more elegant if the condition is
included.
> It seems to me that if you have an algorithm that
> shows part of a large grammar isn't applicable to a particular
> sentence, then the obvious thing to do is to use that information to
> cut down the search space, which I would call filtering, rather than
> extraction of a sub-grammar.
...
> But I think this is not what you mean so maybe you'd like to be more
> specific about what you had in mind.
Filtering is very relevant to this problem, so I am looking forward to
checking the references you mentioned. However, filtering is a part of
the parsing process, and in that approach the two cannot be clearly
separated. I believe they should be separated, and I'll try to explain
when and why this is the case:
1. It fits well with a modular approach to NLP (like Zajac and Jan 2000:
Modular unification-based parsers). Two modules can be run in sequence: the
first module generates a small grammar for a text, and the next module is
a parser, which does the parsing without the burden of a large grammar.
The two problems seem to be different in nature: parsing is an inference
process, while sub-grammar extraction (with filtering) can benefit from
information retrieval and database techniques, as well as probabilistic
methods.
In my question-answering system, given a question and candidate
passages, I generate a small sub-grammar using a Perl program. The actual
parsing is then done by a Java, Lisp, or Prolog parser. (A rough sketch of
this extraction step is given after point 2 below.)
2. In an Internet application, the parser is a Java applet running on the
client side. A "real-world" grammar is too large to be transferred over
the net. Splitting the parser so that the filtering part runs on the
server and the rest on the client is not a well-designed solution. A
cleaner solution is for the server to simply create small grammars and
send them to the parser.
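
To make the pipeline in point 1 a bit more concrete: the following toy
sketch shows one way such an extraction step could look. It is not my
actual program; it assumes a made-up plain-text rule format
"LHS -> Sym1 Sym2 ..." with lower-case terminals, and it simply keeps the
rules that can still take part in a parse of the given text (every
terminal must occur in the text, every nonterminal must keep at least one
surviving rule), iterating to a fixpoint.

#!/usr/bin/perl
# Illustrative sketch only: extract a small sub-grammar from a large
# context-free grammar, keeping just the rules that can participate in
# a parse of a given text.  Assumed (hypothetical) rule format, one
# rule per line:   LHS -> Sym1 Sym2 ...
# Terminals are assumed to start with a lower-case letter.
use strict;
use warnings;

my ($grammar_file, $text_file) = @ARGV;
die "usage: $0 grammar.txt text.txt\n" unless defined $text_file;

# 1. Collect the vocabulary of the input text.
my %vocab;
open my $tf, '<', $text_file or die "$text_file: $!";
while (my $line = <$tf>) {
    $vocab{ lc $_ } = 1 for $line =~ /\w+/g;
}
close $tf;

# 2. Read the full grammar.
my @rules;
open my $gf, '<', $grammar_file or die "$grammar_file: $!";
while (my $line = <$gf>) {
    chomp $line;
    next unless $line =~ /^\s*(\S+)\s*->\s*(.+?)\s*$/;
    push @rules, { lhs => $1, rhs => [ split ' ', $2 ], text => $line };
}
close $gf;

# 3. Iteratively discard rules that cannot contribute to a parse of the
#    text: a rule is dropped if its right-hand side contains a terminal
#    missing from the text, or a nonterminal that no longer has any
#    surviving rule.  Repeat until nothing changes (a simple fixpoint).
my $changed = 1;
while ($changed) {
    $changed = 0;
    my %has_rule;
    $has_rule{ $_->{lhs} } = 1 for @rules;
    my @kept;
    for my $r (@rules) {
        my $usable = 1;
        for my $sym (@{ $r->{rhs} }) {
            if ($sym =~ /^[a-z]/) {                # terminal
                $usable = 0 unless $vocab{ lc $sym };
            } else {                               # nonterminal
                $usable = 0 unless $has_rule{$sym};
            }
        }
        if ($usable) { push @kept, $r } else { $changed = 1 }
    }
    @rules = @kept;
}

# 4. Emit the extracted sub-grammar for a downstream parser.
print "$_->{text}\n" for @rules;

In a real unification-based setting the extraction would of course operate
over lexical entries and typed feature structures rather than atomic
rules, and could use indexing or probabilistic scoring as mentioned above,
but the output is the same kind of object: a small grammar file that is
all the downstream parser ever needs to load.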
Vlado