[Corpora-List] (no subject)

maxwell maxwell at umiacs.umd.edu
Thu Jan 17 20:18:33 UTC 2013


On 2013-01-17 09:57, Eirini LS wrote:
> I mean that I have two different scripts for the same word (e.g. two
> scripts for "cat") written by different people. The first script
> generates 358 words (and only 107 words are correct), and the second
> script generates 497 words (and 471 words are correct). Can I say
> that the result of the first script is worse or not?

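Clearly both the precision and the recall of the second script are 
higher.  Its precision is 471/497 (about 95%), against 107/358 (about 
30%) for the first script, and since it also produces more correct 
words (471 vs. 107), its recall is higher whatever the total turns out 
to be.  Of course, without knowing how many words should be generated 
in total, it's hard to say more.  In particular, it's hard to say 
whether 471 is good.  (Is the second script getting 471 out of 500 
possible, or 471 out of 50,000?)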
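
To make the arithmetic concrete, here is a rough sketch in Python.  The 
counts are the ones from your message; the totals of 500 and 50,000 are 
just the hypothetical figures from my question above, not real numbers, 
and the function names are mine.

```python
def precision(correct, generated):
    """Fraction of the generated words that are correct."""
    return correct / generated


def recall(correct, total_expected):
    """Fraction of the words that should be generated that were found."""
    return correct / total_expected


# Counts reported for the two scripts: (generated, correct).
scripts = {"first": (358, 107), "second": (497, 471)}

# Hypothetical totals only; the real number of expected words is unknown.
hypothetical_totals = [500, 50_000]

for name, (generated, correct) in scripts.items():
    print(f"{name}: precision = {precision(correct, generated):.3f}")
    for total in hypothetical_totals:
        print(f"  recall if {total} words are expected = "
              f"{recall(correct, total):.5f}")
```

Precision can be computed from what you already have; recall cannot, 
which is why the total matters so much.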

In general, though, I think comparing at this gross level is only going 
to give a general sort of answer.  What you really want is a test set 
where each input word is paired with its expected output word, so you 
can do error analysis and regression testing.
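
A minimal sketch of what I mean by such a test set, again in Python.  
Everything here is made up for illustration: I don't know what your 
scripts actually do, so I'm assuming a toy pluralization task, a 
stand-in function naive_plural() in place of one of your scripts, and 
a four-word test set.

```python
def evaluate(generate, test_set):
    """Run generate() on every input word and collect the mismatches.

    Returns (accuracy, errors), where errors is a list of
    (input_word, expected_output, actual_output) tuples.
    """
    errors = []
    for word, expected in test_set.items():
        actual = generate(word)
        if actual != expected:
            errors.append((word, expected, actual))
    accuracy = 1 - len(errors) / len(test_set)
    return accuracy, errors


# Hypothetical test set: each input word paired with its expected output.
TEST_SET = {"cat": "cats", "dog": "dogs", "mouse": "mice", "box": "boxes"}


def naive_plural(word):
    """Stand-in for one of the scripts under test (an assumption)."""
    return word + "s"


accuracy, errors = evaluate(naive_plural, TEST_SET)
print(f"accuracy = {accuracy:.2f}")
for word, expected, actual in errors:
    print(f"{word}: expected {expected!r}, got {actual!r}")
```

The list of errors is what you look at for error analysis.  If you keep 
the test set fixed and rerun it after each change to a script, any word 
that used to pass and now fails is a regression.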

    Mike Maxwell



