Arabic-L:LING:Needs Help creating Arabic Gigaword Ground Truth annotation

Dilworth Parkinson dilworth_parkinson at BYU.EDU
Tue Dec 4 19:04:35 UTC 2007


------------------------------------------------------------------------
Arabic-L: Tue 04 Dec 2007
Moderator: Dilworth Parkinson <dilworth_parkinson at byu.edu>
[To post messages to the list, send them to arabic-l at byu.edu]
[To unsubscribe, send message from same address you subscribed from to
listserv at byu.edu with first line reading:
            unsubscribe arabic-l                                      ]

-------------------------Directory------------------------------------

1) Subject: Needs Help creating Arabic Gigaword Ground Truth annotation

-------------------------Messages-----------------------------------
1)
Date: 04 Dec 2007
From: a.nwesri at student.rmit.edu.au
Subject: Needs Help creating Arabic Gigaword Ground Truth annotation

Hi,

I am a PhD student doing research on Arabic IR at RMIT University,
Melbourne, Australia.

I am currently creating a manual ground truth for the Arabic Gigaword
collection. I need this to evaluate current Arabic IR systems and to
test my new algorithms.

I posted the message below to the Corpora newsgroup and was advised
to post it to the Arabic-L newsgroup. Unfortunately, I am not a
member of this group.

I hope that you can post this to the Arabic-L newsgroup so that the
many people working in the field can participate in this work.

I would greatly appreciate your assistance.

Thanks
-------------------------------------------------------

Hi,

I have built a tool for creating manual relevance judgements for the
Arabic Gigaword (AGW) corpus. The corpus is by far the biggest Arabic
corpus available from the LDC.

Most Arabic information retrieval systems have been evaluated using
the AFP TREC2001 corpus and 75 queries. That corpus is relatively small
compared to English corpora. AGW is five times bigger than the
TREC2001 corpus.

So far, a group of 20 people has contributed about 20,000 judgements
for around 80 queries.

If you are a native Arabic speaker, I would greatly appreciate your
help in building this ground truth. If you can add one topic, find
its relevant documents, and mark them, you will have contributed
another topic to the judgements. I am looking to collect as many
judgements as possible.

I will make this ground truth available to the research community once  
I finish my evaluations.

The link for the annotation tool is

http://goanna.cs.rmit.edu.au/~nwesri/agw/index.php

Once again, I am looking for your support and hope that this work will
benefit Arabic IR.


Thanks in advance,

Abdusalam Nwesri
PhD Student,
School of Computer Science and IT,
RMIT University,
Melbourne,
Australia.
--------------------------------------------------------------------------
End of Arabic-L:  04 Dec 2007