"just-in-time" sub-grammar extraction
Vlado Keselj
vkeselj at uwaterloo.ca
Wed Feb 14 16:58:15 UTC 2001
On Wed, 14 Feb 2001, Karel Oliva wrote:
> However, I wonder what is this condition good for. (???)
>
> If you would be ready to drop it, or reduce it to stg. like "partial parsing
> (chart edges) results which possibly participate in building the final
> parse" - which seems quite reasonable, at least for all purposes apart from
> teaching students chart parsing,
Yes, it could replaced with the above condition. For practical purposes,
this condition -- partial results which possibly participate in the final
parse -- is even better. I am only afraid that the problem becomes more
difficult.
I am interested in partial results for two reasons:
1. in question answering: The answers to simple questions are
frequently found in phrases, and we do not need to parse the whole
sentences. Additionally, this makes the approach more robust.
2. theoretically: The partial results are important for compositional
semantics. When two grammars are merged into one, then a partial parse
that cannot participate in a final parse of one grammar, could participate
in a final parse of the resulting grammar.
> then another strategy would be possible, which I used with a considerable
> gain in efficiency in middle- to large-scale parses of Bulgarian and Czech
> some 8yrs ago, and namely:
> that you establish cross-links between lexical classes and rules,
> with the aim that >>each rule is fired (that is, put into the subgrammar
> for parsing the very sentence) only if its lexical trigger is present in
> the sentence<<.
This idea sounds very relevant to what I am trying to do. I will sketch
some of my ideas:
First, a simple algorithm for context-free grammars can be described:
1. put all terminals from the passage into a set S
2. for all rules X -> Y1 .. Yn
3. if Y1, .. Yn \in S then put X in S
4. if S is changed in steps 2-3 then go to step 2
5. the final subgrammar is the input grammar restricted to terminals and
non-terminals in S
For HPSGs, we could replace rules with CF rules in the standard way:
[typeX ... ] -> [typeY1 ... ] ... [typeYn ... ] replaced by
[typeX] -> [typeY1] ... [typeYn]
and apply the algorithm similar to the one for CFGs. It is different
because we have to take into account the type hierarchy. Once a subset of
types is obtained, the type sub-hierarchy and feature subset can be found.
Thank you,
Vlado
More information about the HPSG-L
mailing list