[Corpora-List] a problem about overgeneration
Miles Osborne
miles at inf.ed.ac.uk
Fri Sep 5 13:24:52 UTC 2008
The more usual definition of "overgeneration" is that a grammar can be used
to parse strings that are ungrammatical. This used to be seen as a problem
when people didn't use probabilities, since back then being well-formed
simply meant being accepted by some grammar.
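To make the definition concrete, here is a minimal sketch of a grammar that overgenerates. The toy grammar, its lexicon and the function name `accepts` are all made up for illustration: the grammar ignores number agreement, so a plain CYK recognizer accepts an ungrammatical string just as readily as a grammatical one.

```python
from itertools import product

# Hypothetical toy grammar in Chomsky normal form (not from any real system).
# It has a single noun category, so it cannot enforce number agreement.
BINARY = {("NP", "VP"): {"S"}, ("Det", "N"): {"NP"}}
LEXICON = {
    "the": {"Det"},
    "dog": {"N"}, "dogs": {"N"},
    "sleeps": {"VP"}, "sleep": {"VP"},
}

def accepts(sentence):
    """CYK recognition: True if the toy grammar derives the sentence from S."""
    words = sentence.split()
    n = len(words)
    if n == 0:
        return False
    # chart[i][j] holds the categories that span words[i:j]
    chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, w in enumerate(words):
        chart[i][i + 1] = set(LEXICON.get(w, ()))
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                for left, right in product(chart[i][k], chart[k][j]):
                    chart[i][j] |= BINARY.get((left, right), set())
    return "S" in chart[0][n]
```

Under this grammar, "the dogs sleeps" is accepted alongside "the dog sleeps". If acceptance is the only test of well-formedness, the two cannot be told apart.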
Nowadays this is less of a problem, since you can simply interpret a low
probability as an indicator that a sentence is really just a string (i.e.
junk). Overgeneration could still become a problem, however, if the number of
spurious parses becomes so large that the parameters become fragmented, or if
you need to produce the parses in reasonable time / space, etc.
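A rough sketch of the probabilistic view, under made-up assumptions: the toy PCFG below, its rule probabilities, the `JUNK_THRESHOLD` value and the function names are all hypothetical. The grammar still overgenerates (it licenses agreement mismatches), but it gives them little probability mass, so a Viterbi-style CYK score separates likely sentences from junk.

```python
from collections import defaultdict

# Hypothetical toy PCFG in Chomsky normal form; the probabilities are invented.
# Agreement mismatches are allowed (overgeneration) but are made unlikely.
BINARY = {
    ("NPsg", "VPsg"): [("S", 0.49)],
    ("NPpl", "VPpl"): [("S", 0.49)],
    ("NPsg", "VPpl"): [("S", 0.01)],
    ("NPpl", "VPsg"): [("S", 0.01)],
    ("Det", "Nsg"): [("NPsg", 1.0)],
    ("Det", "Npl"): [("NPpl", 1.0)],
}
LEXICON = {
    "the": [("Det", 1.0)],
    "dog": [("Nsg", 1.0)],
    "dogs": [("Npl", 1.0)],
    "sleeps": [("VPsg", 1.0)],
    "sleep": [("VPpl", 1.0)],
}
JUNK_THRESHOLD = 0.05  # arbitrary cut-off, chosen only for this example

def best_parse_prob(sentence):
    """Viterbi CYK: probability of the most likely S parse, or 0.0 if none."""
    words = sentence.split()
    n = len(words)
    chart = defaultdict(dict)  # (i, j) -> {category: best probability so far}
    for i, w in enumerate(words):
        for cat, p in LEXICON.get(w, []):
            chart[i, i + 1][cat] = p
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):
                for left, p_left in chart[i, k].items():
                    for right, p_right in chart[k, j].items():
                        for parent, p_rule in BINARY.get((left, right), []):
                            cand = p_rule * p_left * p_right
                            if cand > chart[i, j].get(parent, 0.0):
                                chart[i, j][parent] = cand
    return chart[0, n].get("S", 0.0)

def looks_like_junk(sentence):
    """Treat a low best-parse probability as a sign of junk."""
    return best_parse_prob(sentence) < JUNK_THRESHOLD
```

Here "the dogs sleep" scores well above the threshold, while "the dogs sleeps" still gets a parse but falls below it. With a realistic grammar the chart can hold very many spurious analyses, which is where the time / space concern comes in.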
Miles
--
The University of Edinburgh is a charitable body, registered in Scotland,
with registration number SC005336.