"just-in-time" sub-grammar extraction
Vlado Keselj
vkeselj at uwaterloo.ca
Tue Feb 13 21:47:37 UTC 2001
On Tue, 13 Feb 2001, Ann Copestake wrote:
> maybe you'd like to expand on this a bit? The work on subgrammars that
> I know of does not aim for complete consistency with the results of the
> bigger grammar - in fact, generally the point is to cut down on the
> number of readings.
Hi Ann,
Thanks for your reply and the references. I agree that cutting down on
the number of readings is a desirable effect of subgrammar extraction.
So I don't insist on the complete consistency condition; it is just that
the problem seems to be theoretically more elegant if the condition is
included.
> It seems to me that if you have an algorithm that
> shows part of a large grammar isn't applicable to a particular
> sentence, then the obvious thing to do is to use that information to
> cut down the search space, which I would call filtering, rather than
> extraction of a sub-grammar.
...
> But I think this is not what you mean so maybe you'd like to be more
> specific about what you had in mind.
Filtering is very relevant to this problem, so I am looking forward to
checking the references you mentioned. However, filtering is a part of
the parsing process, and in that approach the two cannot be clearly
separated. I believe they should be separated, and I'll try to explain
when and why this is the case:
1. It fits well with a modular approach to NLP (like Zajac and Jan 2000:
Modular unification-based parsers). Two modules can be run in sequence: the
first module generates a small grammar for a text, and the next module is
a parser, which does the parsing without the burden of a large grammar.
The two problems seem to be different in nature: parsing is an inference
process, while sub-grammar extraction (with filtering) can benefit from
information retrieval and database techniques, as well as probabilistic
methods.
In my question-answering system, given a question and candidate
passages, I generate a small sub-grammar using a Perl program. The actual
parsing is then done by a Java, Lisp, or Prolog parser. (A rough sketch of
this extraction step is given after point 2 below.)
2. In an Internet application, the parser is a Java applet running on the
client side. A "real-world" grammar is too large to be transferred over
the net. Splitting the parser so that the filtering part runs on the
server and the rest on the client is not a well-designed solution. A
cleaner solution is for the server to simply create small grammars and
send them to the parser.
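
To make the pipeline in point 1 a bit more concrete: the following toy
sketch shows one way such an extraction step could look. It is not my
actual program; it assumes a made-up plain-text rule format
"LHS -> Sym1 Sym2 ..." with lower-case terminals, and it simply keeps the
rules that can still take part in a parse of the given text (every
terminal must occur in the text, every nonterminal must keep at least one
surviving rule), iterating to a fixpoint.

#!/usr/bin/perl
# Illustrative sketch only: extract a small sub-grammar from a large
# context-free grammar, keeping just the rules that can participate in
# a parse of a given text.  Assumed (hypothetical) rule format, one
# rule per line:   LHS -> Sym1 Sym2 ...
# Terminals are assumed to start with a lower-case letter.
use strict;
use warnings;

my ($grammar_file, $text_file) = @ARGV;
die "usage: $0 grammar.txt text.txt\n" unless defined $text_file;

# 1. Collect the vocabulary of the input text.
my %vocab;
open my $tf, '<', $text_file or die "$text_file: $!";
while (my $line = <$tf>) {
    $vocab{ lc $_ } = 1 for $line =~ /\w+/g;
}
close $tf;

# 2. Read the full grammar.
my @rules;
open my $gf, '<', $grammar_file or die "$grammar_file: $!";
while (my $line = <$gf>) {
    chomp $line;
    next unless $line =~ /^\s*(\S+)\s*->\s*(.+?)\s*$/;
    push @rules, { lhs => $1, rhs => [ split ' ', $2 ], text => $line };
}
close $gf;

# 3. Iteratively discard rules that cannot contribute to a parse of the
#    text: a rule is dropped if its right-hand side contains a terminal
#    missing from the text, or a nonterminal that no longer has any
#    surviving rule.  Repeat until nothing changes (a simple fixpoint).
my $changed = 1;
while ($changed) {
    $changed = 0;
    my %has_rule;
    $has_rule{ $_->{lhs} } = 1 for @rules;
    my @kept;
    for my $r (@rules) {
        my $usable = 1;
        for my $sym (@{ $r->{rhs} }) {
            if ($sym =~ /^[a-z]/) {                # terminal
                $usable = 0 unless $vocab{ lc $sym };
            } else {                               # nonterminal
                $usable = 0 unless $has_rule{$sym};
            }
        }
        if ($usable) { push @kept, $r } else { $changed = 1 }
    }
    @rules = @kept;
}

# 4. Emit the extracted sub-grammar for a downstream parser.
print "$_->{text}\n" for @rules;

In a real unification-based setting the extraction would of course operate
over lexical entries and typed feature structures rather than atomic
rules, and could use indexing or probabilistic scoring as mentioned above,
but the output is the same kind of object: a small grammar file that is
all the downstream parser ever needs to load.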
Vlado