LFG systems and LFG computer implementations
Martin Forst
martin.forst at web.de
Mon Apr 5 19:12:18 UTC 2010
Hi Lori,
As Mark has already pointed out, XLE provides machinery for parse
ranking based on discriminative log-linear models. For English, these
have primarily been trained on the WSJ part of the PTB, but we (the NL
team at Powerset) are now moving to our own annotated data, which are
produced by means of the LFG Parsebanker from the University of
Bergen. For German, they have been trained on the TIGER Treebank; you
can find more details in my thesis, which is attached. I also know
that the Fuji Xerox team that develops the Japanese ParGram grammar
has trained such models, but I'm not sure which annotated corpus they
have been using. Finally, the Norwegians have done initial experiments
in parse ranking based on relatively small sets of data produced with
their Parsebanker tool; as far as I know, they have been surprisingly
successful given the small size of those data sets. Most of the
remaining ParGram grammars are probably still struggling with the
coverage needed to parse corpora and with the availability of
treebanks, but the idea definitely is to ultimately complement those
symbolic grammars with machine-learned models, too.
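As a rough illustration of the ranking step described above (a minimal sketch, not XLE's actual code): a log-linear model scores each candidate analysis by a weighted sum of its feature counts and normalizes over all analyses of the same sentence. The function name, feature names, and weights below are made up for the example.

```python
import math

def rank_parses(candidates, weights):
    """Return (probabilities, index of best parse) under a log-linear model.

    candidates: list of dicts mapping feature name -> count for one parse
    weights:    dict mapping feature name -> learned weight
    """
    # Linear score of each parse: sum of weight * feature count.
    scores = [sum(weights.get(f, 0.0) * v for f, v in feats.items())
              for feats in candidates]
    # Softmax over the candidate set; subtract the max for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    probs = [e / z for e in exps]
    best = max(range(len(candidates)), key=probs.__getitem__)
    return probs, best

# Hypothetical feature names and weights, for illustration only.
weights = {
    "pp_attach_to_verb": 0.8,
    "pp_attach_to_noun": -0.3,
    "dispreferred_ot_mark": -1.2,
}
candidates = [
    {"pp_attach_to_verb": 1},
    {"pp_attach_to_noun": 1, "dispreferred_ot_mark": 1},
]
probs, best = rank_parses(candidates, weights)
```

In training, the weights would be estimated so that the analyses compatible with the treebank annotation receive high conditional probability; the sketch only covers applying an already trained model.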
Another method that uses training data which may come from treebanks
is what we call c-structure pruning. It is basically a PCFG-based way
to reduce the number of c-structures for which the f-annotations have
to be solved. This speeds up the parser and, because fewer sentences
time out, yields more full analyses. Aoife Cahill, John Maxwell, Tracy
King, and Paul Meurer have publications on this.
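The general idea can be sketched as follows, under simplifying assumptions: score competing subtrees for the same constituent with a PCFG and discard those that are much less probable than the best one before any f-annotations are solved. This is not the XLE implementation; the tree encoding, the toy grammar, and the cutoff values are all invented for the example.

```python
import math

def tree_logprob(tree, rule_logprob):
    """Log probability of a tree under a PCFG.

    tree: (label, [children]) for internal nodes, or a plain string for a word.
    rule_logprob: dict mapping (lhs, rhs_tuple) -> log probability.
    """
    if isinstance(tree, str):
        return 0.0  # words carry no probability of their own
    label, children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    return rule_logprob[(label, rhs)] + sum(
        tree_logprob(c, rule_logprob) for c in children)

def prune(candidates, rule_logprob, cutoff):
    """Keep the subtrees whose log probability is within `cutoff` of the best."""
    scored = [(tree_logprob(t, rule_logprob), t) for t in candidates]
    best = max(lp for lp, _ in scored)
    return [t for lp, t in scored if best - lp <= cutoff]

# Toy grammar with made-up probabilities.
rule_logprob = {k: math.log(p) for k, p in {
    ("VP", ("V", "NP", "PP")): 0.4,
    ("VP", ("V", "NP")): 0.6,
    ("NP", ("N",)): 0.9,
    ("NP", ("NP", "PP")): 0.1,
    ("PP", ("P", "NP")): 1.0,
    ("V", ("saw",)): 1.0,
    ("P", ("with",)): 1.0,
    ("N", ("stars",)): 0.5,
    ("N", ("telescopes",)): 0.5,
}.items()}

np_stars = ("NP", [("N", ["stars"])])
pp = ("PP", [("P", ["with"]), ("NP", [("N", ["telescopes"])])])
cand_a = ("VP", [("V", ["saw"]), np_stars, pp])            # PP attached to VP
cand_b = ("VP", [("V", ["saw"]), ("NP", [np_stars, pp])])  # PP attached to NP
candidates = [cand_a, cand_b]

kept_strict = prune(candidates, rule_logprob, math.log(5))
kept_loose = prune(candidates, rule_logprob, math.log(10))
```

With the stricter cutoff only the VP-attachment subtree survives; with the looser one both are passed on to the f-annotation solver. Choosing the cutoff trades speed against the risk of discarding the subtree that the correct analysis needs.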
Finally, I tried learning the ranking of the OT marks used in the
German grammar from TIGER Treebank data at some point - with some but
not huge success. You can find a paper on that experiment in the LFG
2005 Proceedings:
http://csli-publications.stanford.edu/LFG/10/lfg05forstetal.pdf
Best regards,
Martin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: thesis-forst.pdf
Type: application/pdf
Size: 2262824 bytes
Desc: not available
URL: <http://listserv.linguistlist.org/pipermail/lfg/attachments/20100405/6078ebc1/attachment.pdf>