[Corpora-List] ON using a subject in the SUBJECT line

Khurshid Ahmad kahmad at scss.tcd.ie
Thu Jan 17 20:43:29 UTC 2013


Please, folks, use the subject line.  I have to open every mail from the
Corpora List because I cannot tell whether the mail is relevant or not.

On 17-01-2013 20:18, maxwell wrote:
> On 2013-01-17 09:57, Eirini LS wrote:
>> I mean that I have two different scripts for the same word (e.g. two
>> scripts for "cat") written by different people. The first script
>> generates 358 words (and only 107 words are correct), and the second
>> script generates 497 words (and 471 words are correct). Can I say that
>> the result of the first script is worse or not?
>
> Clearly the recall and precision on the second script are higher.  Of
> course, without knowing what the total number of words that should be
> generated is, it's hard to say more.  In particular, it's hard to say
> whether 471 is good.  (Is the second script getting 471 out of 500
> possible, or 471 out of 50,000?)
>
> In general, though, I think comparing at this gross level is only
> going to give a general sort of answer.  What you really want is a
> test set where each input word is paired with its expected output
> word, so you can do error analysis and regression testing.
>
>    Mike Maxwell
>
> _______________________________________________
> UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora
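
For anyone who wants to check the arithmetic in the quoted exchange, here
is a minimal sketch in Python. The counts (107 of 358, 471 of 497) come from
the thread. Everything else is an assumption made for illustration: the
function names, the two candidate totals (500 and 50,000, taken from the
rhetorical question above), and the toy word pairs in the test set. It is
not either poster's actual script.

```python
def precision(correct, generated):
    """Fraction of the generated words that are correct."""
    return correct / generated


def recall(correct, expected_total):
    """Fraction of the words that should be generated that actually were."""
    return correct / expected_total


# Counts quoted in the thread.
p_first = precision(107, 358)
p_second = precision(471, 497)
print(f"script 1 precision: {p_first:.3f}")
print(f"script 2 precision: {p_second:.3f}")

# The true number of expected words is unknown in the thread, so recall is
# shown for two hypothetical totals only.
for total in (500, 50_000):
    print(f"total={total}: recall 1 = {recall(107, total):.4f}, "
          f"recall 2 = {recall(471, total):.4f}")


def find_errors(gold, produced):
    """Compare produced outputs with a gold test set of input -> expected
    output pairs; return {input: (produced, expected)} for every mismatch."""
    return {
        word: (produced.get(word), expected)
        for word, expected in gold.items()
        if produced.get(word) != expected
    }


# Toy data, invented for this example.
gold = {"cat": "cats", "mouse": "mice", "sheep": "sheep"}
produced = {"cat": "cats", "mouse": "mouses"}
errors = find_errors(gold, produced)
print(errors)
```

Because the same target word list applies to both scripts, the second
script's recall is higher whatever the total turns out to be; only the
absolute values depend on it. Keeping the gold pairs in a file and
re-running find_errors after each change to a script gives the error
analysis and regression testing described above.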

-- 
Best wishes

Khurshid Ahmad. PhD, FBCS, FTCD, CITP
Professor of Computer Science
School of Computer Science and Statistics
Trinity College
Dublin 2
IRELAND

Phone: 00353 1 896 8429 (Labs: 00 353 1 8968435)
Fax 353 1 677 2204
Webpage: www.cs.tcd.ie/khurshid.ahmad
