Arabic-L:LING:LDC Arabic Morphological Tagger
Dilworth Parkinson
dil at BYU.EDU
Sat Jun 20 14:40:03 UTC 2009
------------------------------------------------------------------------
Arabic-L: Sat 20 Jun 2009
Moderator: Dilworth Parkinson <dilworth_parkinson at byu.edu>
[To post messages to the list, send them to arabic-l at byu.edu]
[To unsubscribe, send message from same address you subscribed from to
listserv at byu.edu with first line reading:
unsubscribe arabic-l ]
-------------------------Directory------------------------------------
1) Subject: LDC Arabic Morphological Tagger
-------------------------Messages-----------------------------------
1)
Date: 20 Jun 2009
From: ldc at ldc.upenn.edu
Subject: LDC Arabic Morphological Tagger
LDC Introduces its Standard Arabic Morphological Tagger
At a recent LDC Institute seminar, Rushin Shah, a visiting scholar at
LDC, presented a new tool for corpus annotation, the Standard Arabic
Morphological Tagger (SAMT). The current process of Arabic corpus
annotation at LDC relies on using the Standard Arabic Morphological
Analyzer (SAMA) to generate various morphology and lemma choices, and
supplying these to manual annotators who then pick the correct choice.
SAMA can generate dozens of choices for each word and does not provide
any information about the likelihood of a particular choice being
correct. SAMT addresses these problems by ranking the choices in order
of probability with a high degree of accuracy, thereby reducing
annotation time.
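The ranking step described above can be illustrated with a minimal sketch. This is not SAMT's actual implementation: the announcement does not say how the probabilities are computed, so the `Analysis` record, the `rank_analyses` function, the placeholder lemmas and tags, and the scores below are all hypothetical. It only shows the idea of presenting an analyzer's candidate list to an annotator with the most probable choice first.

```python
from dataclasses import dataclass


@dataclass
class Analysis:
    """One candidate analysis for a word (hypothetical structure)."""
    lemma: str
    pos_tag: str
    probability: float  # assumed score from some model, not real SAMT output


def rank_analyses(candidates):
    """Return the candidates sorted from most to least probable.

    sorted() is stable, so candidates with equal scores keep
    their original relative order.
    """
    return sorted(candidates, key=lambda a: a.probability, reverse=True)


# Placeholder candidates standing in for an analyzer's unranked output.
candidates = [
    Analysis("lemma_a", "NOUN", 0.15),
    Analysis("lemma_b", "VERB", 0.70),
    Analysis("lemma_c", "ADJ", 0.15),
]

ranked = rank_analyses(candidates)
for rank, analysis in enumerate(ranked, start=1):
    print(rank, analysis.lemma, analysis.pos_tag, analysis.probability)
```

With a ranked list, an annotator would usually find the correct choice near the top rather than scanning dozens of unordered options, which is the time saving the announcement describes.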
You can view abstracts and presentation slides of this and other
presentations in LDC's seminar series on data creation on our LDC
Institute project page.
--------------------------------------------------------------------------
End of Arabic-L: 20 Jun 2009