"just-in-time" sub-grammar extraction

Wed Feb 14 16:58:15 UTC 2001

On Wed, 14 Feb 2001, Karel Oliva wrote:

> However, I wonder what is this condition good for. (???)
>
> If you would be ready to drop it, or reduce it to stg. like "partial parsing
> (chart edges) results which possibly participate in building the final
> parse" - which seems quite reasonable, at least for all purposes apart from
> teaching students chart parsing,

Yes, it could replaced with the above condition.  For practical purposes,
this condition -- partial results which possibly participate in the final
parse -- is even better.  I am only afraid that the problem becomes more
difficult.

I am interested in partial results for two reasons:

1. in question answering: The answers to simple questions are
frequently found in phrases, and we do not need to parse the whole
sentences.  Additionally, this makes the approach more robust.

2. theoretically: The partial results are important for compositional
semantics.  When two grammars are merged into one, then a partial parse
that cannot participate in a final parse of one grammar, could participate
in a final parse of the resulting grammar.

> then another strategy would be possible, which I used with a considerable
> gain in efficiency in middle- to large-scale parses of Bulgarian and Czech
> some 8yrs ago, and namely:
>          that you establish cross-links between lexical classes and rules,
> with the aim that >>each rule is fired (that is, put into the subgrammar
> for parsing the very sentence) only if its lexical trigger is present in
> the sentence<<.

This idea sounds very relevant to what I am trying to do.  I will sketch
some of my ideas:

First, a simple algorithm for context-free grammars can be described:
1. put all terminals from the passage into a set S
2. for all rules X -> Y1 .. Yn
3.     if Y1, .. Yn \in S then put X in S
4. if S is changed in steps 2-3 then go to step 2
5. the final subgrammar is the input grammar restricted to terminals and
   non-terminals in S

For HPSGs, we could replace rules with CF rules in the standard way:
[typeX ... ] -> [typeY1 ... ] ... [typeYn ... ] replaced by
[typeX] -> [typeY1] ... [typeYn]

and apply the algorithm similar to the one for CFGs.  It is different
because we have to take into account the type hierarchy.  Once a subset of
types is obtained, the type sub-hierarchy and feature subset can be found.

Thank you,

Vlado