[Corpora-List] ON using a subject in the SUBJECT line
Khurshid Ahmad
kahmad at scss.tcd.ie
Thu Jan 17 20:43:29 UTC 2013
Please, folks, use the subject line. I have to open every mail from the
Corpora List because I cannot tell whether the mail is relevant or not.
On 17-01-2013 20:18, maxwell wrote:
> On 2013-01-17 09:57, Eirini LS wrote:
>> I mean that I have two different scripts for the same word (e.g. two
>> scripts for "cat") written by different people. The first script
>> generates 358 words (and only 107 words are correct), and the second
>> script generates 497 words (and 471 words are correct). Can I say that
>> the result of the first script is worse or not?
>
> Clearly the recall and precision on the second script are higher. Of
> course, without knowing what the total number of words that should be
> generated is, it's hard to say more. In particular, it's hard to say
> whether 471 is good. (Is the second script getting 471 out of 500
> possible, or 471 out of 50,000?)
>
> In general, though, I think comparing at this gross level is only
> going to give a general sort of answer. What you really want is a
> test set where each input word is paired with its expected output
> word, so you can do error analysis and regression testing.
>
> Mike Maxwell
>
> _______________________________________________
> UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora
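
The comparison quoted above can be made concrete. Precision follows
directly from the reported counts (correct words divided by generated
words). Recall cannot be computed from them, because the total number of
words that should be generated is not given. The following is only a
minimal sketch in Python: the function names are hypothetical, and the
totals of 500 and 50,000 are the illustrative figures from the quoted
reply, not real data.

```python
from typing import Optional


def precision(correct: int, generated: int) -> float:
    """Fraction of the generated words that are correct."""
    return correct / generated


def recall(correct: int, expected_total: Optional[int]) -> Optional[float]:
    """Fraction of the expected words that were produced.

    Returns None when the expected total is unknown, as it is in the thread.
    """
    if expected_total is None:
        return None
    return correct / expected_total


# Counts reported in the thread.
first = precision(correct=107, generated=358)
second = precision(correct=471, generated=497)
print(f"first script precision:  {first:.3f}")
print(f"second script precision: {second:.3f}")

# Recall depends on a total that the thread does not give; the two totals
# below are only the hypothetical figures mentioned in the quoted reply.
for total in (500, 50_000):
    print(f"second script recall if {total} words were expected: "
          f"{recall(471, total):.5f}")
```

On these counts the second script has the higher precision, and for any
fixed expected total it also has the higher recall, since 471 is greater
than 107. How good 471 is in absolute terms still depends on that total.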
--
Best wishes
Khurshid Ahmad. PhD, FBCS, FTCD, CITP
Professor of Computer Science
School of Computer Science and Statistics
Trinity College
Dublin 2
IRELAND
Phone: 00353 1 896 8429 (Labs: 00 353 1 8968435)
Fax 353 1 677 2204
Webpage: www.cs.tcd.ie/khurshid.ahmad