[Corpora-List] Metrics for corpus "parseability"
Sandra Kuebler
skuebler at indiana.edu
Mon Feb 4 22:27:56 UTC 2008
There is related work on the ambiguity of grammars induced from
treebanks: Anna Corazza, Alberto Lavelli, and Giorgio Satta used
conditional cross entropy to measure it. This may help to at least
abstract away from the parser :)
Sandra
On Feb 4, 2008, at 5:21 PM, Miles Osborne wrote:
> Chris Brew suggested I actually explain what I meant: here is a
> sample paper on phase transitions in solving problems like 3-SAT:
>
> http://www.sciencemag.org/cgi/content/abstract/264/5163/1297
>
> Props to Chris!
>
> Miles
>
> On 04/02/2008, Miles Osborne <miles at inf.ed.ac.uk> wrote:
> I must confess, the idea that a corpus can be described in terms of
> "parseability" sounds a little ill-founded to me. The choice of a
> particular parsing algorithm may dictate which examples are hard to
> process, as will the underlying grammar, etc.
>
> What would be interesting (read: hard) would be to look at the
> work on phase transitions in 3-SAT problems and the like. So, are
> there underlying graph-related characteristics of parsing which
> make certain sentences intrinsically hard to process, and in
> particular, can these characteristics be framed in a manner that is
> independent of the actual parser?
>
> Miles
>
> --
> The University of Edinburgh is a charitable body, registered in
> Scotland, with registration number SC005336.
>
> _______________________________________________
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora
Sandra Kuebler
Indiana University
Department of Linguistics
Memorial Hall 322
1021 E. Third Street
Bloomington IN 47405
USA
phone: (812) 855-3268
fax: (812) 855-5363
email: skuebler at indiana.edu