[Corpora-List] Metrics for corpus "parseability"
Miles Osborne
miles at inf.ed.ac.uk
Tue Feb 5 08:28:19 UTC 2008
Actually, I think you have misunderstood what I said: this truly is about
the data and not about "algorithms". What I said was that you need to be
able to understand the hardness of the sentences themselves, without
reference to the parser etc. Read that sample paper and you will know what
I mean.
Miles
On 05/02/2008, Adam Kilgarriff <adam at lexmasterclass.com> wrote:
>
> On 04/02/2008, Miles Osborne <miles at inf.ed.ac.uk> wrote:
> >
> > I must confess, the idea that a corpus can be described in terms of
> > "parseability" sounds a little ill-founded to me. The choice of a particular
> > parsing algorithm may dictate which examples are hard to process, as will
> > the underlying grammar etc.
>
>
> I couldn't disagree more. It's the equivalent of saying that it's
> ill-founded to evaluate parsers because they will always perform differently
> on different corpora. It just goes to show that you're interested in
> algorithms not data. The field is way imbalanced by people who think more
> about algorithms than the corpora they apply them to.
>
> Adam
>
>
> --
> ================================================
> Adam Kilgarriff
> http://www.kilgarriff.co.uk
> Lexical Computing Ltd http://www.sketchengine.co.uk
> Lexicography MasterClass Ltd http://www.lexmasterclass.com
> Universities of Leeds and Sussex adam at lexmasterclass.com
> ================================================
>
>
--
The University of Edinburgh is a charitable body, registered in Scotland,
with registration number SC005336.