[Corpora-List] a problem about overgeneration

Miles Osborne miles at inf.ed.ac.uk
Fri Sep 5 13:24:52 UTC 2008


the more usual definition of "overgeneration" is that a grammar can be used
to parse strings that are ungrammatical.  this used to be seen as a problem
before people used probabilities, since being well-formed was then defined
as being accepted by some grammar.
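
as a minimal sketch of that definition (the toy grammar, the lexicon and the
function name `accepts` are all made up for illustration, not taken from any
real system), here is a CKY recogniser over a tiny grammar that has no notion
of number agreement, so it also accepts an ungrammatical string:

```python
from collections import defaultdict

# Hypothetical toy grammar in Chomsky normal form (an assumption for this sketch).
# It does not encode number agreement, so it overgenerates.
BINARY = {("NP", "VP"): {"S"}, ("Det", "N"): {"NP"}}
LEXICAL = {
    "the": {"Det"},
    "dog": {"N"},
    "dogs": {"N"},
    "barks": {"VP"},
    "bark": {"VP"},
}


def accepts(words, start="S"):
    """Return True if the toy grammar derives `words` from `start` (CKY recognition)."""
    n = len(words)
    if n == 0:
        return False
    chart = defaultdict(set)  # chart[i, k] = labels that span words[i:k]
    for i, word in enumerate(words):
        chart[i, i + 1] = set(LEXICAL.get(word, ()))
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            k = i + span
            for j in range(i + 1, k):  # split point between left and right child
                for left in chart[i, j]:
                    for right in chart[j, k]:
                        chart[i, k] |= BINARY.get((left, right), set())
    return start in chart[0, n]


print(accepts("the dog barks".split()))   # True: grammatical and accepted
print(accepts("the dogs barks".split()))  # True: ungrammatical but accepted (overgeneration)
print(accepts("barks the dog".split()))   # False: rejected
```

under the purely categorical view, "the dogs barks" counts as well-formed here
simply because the grammar accepts it.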

nowadays, this is less of a problem, since you can simply interpret a low
probability as an indicator that a string is junk rather than a sentence.
overgeneration could still become a problem, however, if the number of
spurious parses becomes so large that parameters become fragmented, or if
you need to produce the parses in reasonable time / space etc.
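
a rough sketch of the probabilistic reading, under stated assumptions: the toy
PCFG below, its rule probabilities, the spurious rule NP -> N Det and the
cut-off in `looks_like_junk` are all invented for illustration.  the inside
algorithm sums the probability of every parse of a string, and a string that
is only licensed by the spurious rule ends up with a much lower score:

```python
from collections import defaultdict

# Hypothetical toy PCFG (all rules and probabilities are assumptions).
# NP -> N Det is a deliberately spurious, low-probability rule.
BINARY = [
    ("S", "NP", "VP", 1.0),
    ("NP", "Det", "N", 0.99),
    ("NP", "N", "Det", 0.01),
]
LEXICAL = {
    "the": {"Det": 1.0},
    "dog": {"N": 0.5},
    "dogs": {"N": 0.5},
    "barks": {"VP": 0.5},
    "bark": {"VP": 0.5},
}


def inside_probability(words, start="S"):
    """Total probability of all parses of `words` rooted in `start` (inside algorithm)."""
    n = len(words)
    chart = defaultdict(lambda: defaultdict(float))  # chart[i, k][label] = inside prob
    for i, word in enumerate(words):
        for label, prob in LEXICAL.get(word, {}).items():
            chart[i, i + 1][label] = prob
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            k = i + span
            for j in range(i + 1, k):
                for parent, left, right, prob in BINARY:
                    left_prob = chart[i, j].get(left, 0.0)
                    right_prob = chart[j, k].get(right, 0.0)
                    if left_prob and right_prob:
                        chart[i, k][parent] += prob * left_prob * right_prob
    return chart[0, n].get(start, 0.0)


def looks_like_junk(words, threshold=0.01):
    """Flag strings whose probability falls below an arbitrary, assumed cut-off."""
    return inside_probability(words) < threshold


for text in ("the dog barks", "dog the barks", "barks the dog"):
    tokens = text.split()
    print(text, inside_probability(tokens), looks_like_junk(tokens))
```

with these made-up numbers, "dog the barks" is still parsed but scores about a
hundred times lower than "the dog barks".  a fixed cut-off like this is only
for illustration, since string probabilities also shrink with length.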

Miles
