Arabic-L:LING:Lexical Database for Arabic Input

Dilworth Parkinson dilworth_parkinson at BYU.EDU
Fri Apr 21 18:55:47 UTC 2006


------------------------------------------------------------------------
Arabic-L: Fri 21 Apr 2006
Moderator: Dilworth Parkinson <dilworth_parkinson at byu.edu>
[To post messages to the list, send them to arabic-l at byu.edu]
[To unsubscribe, send message from same address you subscribed from to
listserv at byu.edu with first line reading:
            unsubscribe arabic-l                                      ]

-------------------------Directory------------------------------------

1) Subject:Lexical Database for Arabic Input

-------------------------Messages-----------------------------------
1)
Date: 21 Apr 2006
From:timbuckwalter at qamus.org
Subject:Lexical Database for Arabic Input

The frequency data you describe assumes the use of a corpus. If so, some
lexical distribution figures (aka, dispersion indexes) may be useful,
possibly along the lines of what Kucera & Francis did with the Brown
corpus, or John Carroll's standard frequency index, etc. There are
probably more recent works. Also, if you plan to identify word-sense
subdivisions, you will find the numbered senses in Jan Hoogland's
Arabic-Dutch dictionary quite useful. Hoogland's dictionary is
considerably more up to date than Wehr's lexicon--and it's corpus based.
And finally, some attention to grammatical collocations would be nice:
this would allow us to search for things such as false idafahs, and
maybe even interesting grammatical mistakes.
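
To make the idea of a dispersion index concrete, here is a minimal sketch
in Python. It computes Juilland's D, one commonly used dispersion measure;
it is not necessarily the measure used in the works named above. The
function name and the sample counts are made up for illustration, and the
sketch assumes the corpus has been split into equally sized parts.

```python
from math import sqrt


def juilland_d(part_counts):
    """Juilland's D for one word, given its counts in n equal-sized corpus parts.

    Values near 1 mean the word is spread evenly across the parts;
    values near 0 mean it is concentrated in very few parts.
    """
    n = len(part_counts)
    if n < 2:
        raise ValueError("need at least two corpus parts")
    mean = sum(part_counts) / n
    if mean == 0:
        # Word never occurs; returning 0.0 is a convention chosen for this sketch.
        return 0.0
    # Population variance of the per-part counts.
    variance = sum((c - mean) ** 2 for c in part_counts) / n
    # Coefficient of variation, scaled by its maximum possible value sqrt(n - 1).
    cv = sqrt(variance) / mean
    return 1 - cv / sqrt(n - 1)


# Hypothetical counts of two words across four corpus parts.
even_word = [5, 5, 5, 5]      # same total, evenly spread
clumped_word = [20, 0, 0, 0]  # same total, all in one part
print(juilland_d(even_word), juilland_d(clumped_word))
```

Two words with the same raw frequency can thus get very different
dispersion values, which is why such a figure is worth storing next to
the frequency count in a lexical database.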

Tim Buckwalter
Linguistic Data Consortium
Univ. of Pennsylvania

------------------------------------------------------------------------
End of Arabic-L:  21 Apr 2006


