Arabic-L:LING:LDC Arabic Morphological Tagger

Dilworth Parkinson dil at BYU.EDU
Sat Jun 20 14:40:03 UTC 2009


------------------------------------------------------------------------
Arabic-L: Sat 20 Jun 2009
Moderator: Dilworth Parkinson <dilworth_parkinson at byu.edu>
[To post messages to the list, send them to arabic-l at byu.edu]
[To unsubscribe, send message from same address you subscribed from to
listserv at byu.edu with first line reading:
            unsubscribe arabic-l                                      ]

-------------------------Directory------------------------------------

1) Subject: LDC Arabic Morphological Tagger

-------------------------Messages-----------------------------------
1)
Date: 20 Jun 2009
From: ldc at ldc.upenn.edu
Subject: LDC Arabic Morphological Tagger

LDC Introduces its Standard Arabic Morphological Tagger

At a recent LDC Institute seminar, Rushin Shah, a visiting scholar at
LDC, presented a new tool for corpus annotation, the Standard Arabic
Morphological Tagger (SAMT).  The current process of Arabic corpus
annotation at LDC relies on the Standard Arabic Morphological
Analyzer (SAMA) to generate candidate morphology and lemma choices,
which are supplied to manual annotators who then pick the correct one.
SAMA can generate dozens of choices for each word and does not provide
any information about the likelihood that a particular choice is
correct.  SAMT addresses these problems by ranking the choices in order
of probability with a high degree of accuracy, thereby reducing
annotation time.
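
The ranking idea described above can be illustrated with a minimal
Python sketch.  This is not SAMT's implementation: the announcement does
not say how the probabilities are estimated, so the Analysis class, the
rank_analyses function, the tag labels and the probability values below
are all hypothetical and serve only to show what "ranking choices in
order of their probabilities" looks like for one word.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Analysis:
    """One candidate analysis for a word (hypothetical structure)."""
    form: str           # vocalized reading of the word
    tag: str            # illustrative tag label, not a real SAMA tag
    probability: float  # made-up likelihood, assumed to come from some model


def rank_analyses(candidates):
    """Return the candidates sorted from most to least probable."""
    return sorted(candidates, key=lambda a: a.probability, reverse=True)


# Three possible readings of the unvocalized string "ktb".
# The probabilities are invented for illustration only.
candidates = [
    Analysis("kutub", "NOUN", 0.30),
    Analysis("kataba", "VERB_PERFECT", 0.55),
    Analysis("kutiba", "VERB_PASSIVE", 0.15),
]

ranked = rank_analyses(candidates)
for position, analysis in enumerate(ranked, start=1):
    print(position, analysis.form, analysis.tag, analysis.probability)
```

With a list ordered this way, an annotator would see the most likely
choice first instead of scanning an unordered set of candidates.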

You can view abstracts and presentation slides for this and other
presentations in LDC's seminar series on data creation on the LDC
Institute project page.

--------------------------------------------------------------------------
End of Arabic-L:  20 Jun 2009


