Arabic-L:LING:Arabic text analysis software responses

Dilworth Parkinson dilworthparkinson at GMAIL.COM
Fri Sep 13 15:39:23 UTC 2013


------------------------------------------------------------------------
Arabic-L: Thu 12 Sep 2013
Moderator: Dilworth Parkinson <dilworth_parkinson at byu.edu>
[To post messages to the list, send them to arabic-l at byu.edu]
[To unsubscribe, send message from same address you subscribed from to
listserv at byu.edu with first line reading:
           unsubscribe arabic-l                                      ]

-------------------------Directory------------------------------------

1) Subject: Arabic text analysis software response
2) Subject: Arabic text analysis software response
3) Subject: Arabic text analysis software response

-------------------------Messages-----------------------------------
1)
Date: 12 Sep 2013
From: Eric Atwell <E.S.Atwell at leeds.ac.uk>
Subject: Arabic text analysis software response

I recommend you try the http://sketchengine.co.uk/ 30-day free trial.
This website allows you to upload your own Arabic corpus, or use an
existing corpus on the website, or you can even use the web-crawler to
collect a corpus from your own chosen websites. Then you can
automatically extract wordlists, keywords, terms, and thesauri;
compare and contrast usages of words; and extract lexical patterns.
SketchEngine is used by dictionary publishers (Oxford University Press,
Le Robert, Cornelsen, Collins, Macmillan etc) but is also useful for
individual Arabic language teachers and researchers.

Eric

Dear Sohaib,

I suggest you also contact your colleagues at Taibah University
in the College of Computer Science and Engineering, who are also
researching Arabic text analysis, particularly religious texts.
They may be interested in collaboration on Arabic text corpus analysis.
I have met Dr Mohamed Menacer of the NOOR research centre at Taibah
University, who is helping to organise a conference around this topic in
December 2013: International Conference on Advances in Information
Technology for the Holy Quran and Its Sciences.
http://www.taibahu.edu.sa/pages.aspx?pid=11438&ln=en

He can be contacted at:

Dr Mohamed Menacer eazmm at hotmail.com
Department of Computer Science
College of Computer Science and Engineering, Taibah University,
P.O. Box 30002, Madinah Munawarrah,
Kingdom of Saudi Arabia
Mobile: +966-530943483


regards

Eric

--------------------------------------------------------------------------
2)
Date: 12 Sep 2013
From:  "Jiří Milička" <milicka at centrum.cz>
Subject: Arabic text analysis software response

Hello Sohaib
Try TypeTokener (milicka.cz/en/typetokener); it is freeware.
Just give it the names of the files you want to process, in plain-text
format (UTF-8). It can remove vocalisation if you wish, and it gives you
the set of word types (= the set of distinct words), the rank-frequency
relation ("Zipf's law"), the number of word types, the type-token relation
(Herdan's/Heaps' law), and a combinatorial model of the type-token
relation, which can help you discover inhomogeneities in the text.
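
For readers who want to see what these measures amount to, here is a
minimal Python sketch. It is not TypeTokener's code and makes no claim
about how that program works internally; the function names are
illustrative, the tokenizer is deliberately naive (it only splits on
non-letters), and the set of characters treated as "vocalisation" is an
assumption (the Arabic marks U+064B to U+065F, the superscript alef
U+0670, and the tatweel U+0640).

```python
import re
from collections import Counter

# Assumed definition of "vocalisation": Arabic harakat and related marks,
# the superscript alef, and the tatweel (elongation) character.
VOCALISATION = re.compile(r"[\u064B-\u065F\u0670\u0640]")

# A token is a run of letters, optionally carrying Arabic marks.
TOKEN = re.compile(r"(?:[^\W\d_]|[\u064B-\u065F\u0670])+")


def strip_vocalisation(text):
    """Remove vowel marks and tatweel so vocalised and bare forms match."""
    return VOCALISATION.sub("", text)


def tokenize(text):
    """Return the word tokens of the text, in order of appearance."""
    return TOKEN.findall(text)


def rank_frequency(tokens):
    """Return (rank, word, frequency) triples, most frequent word first."""
    counts = Counter(tokens)
    return [
        (rank, word, freq)
        for rank, (word, freq) in enumerate(counts.most_common(), start=1)
    ]


def type_token_curve(tokens):
    """Return the number of distinct word types seen after each token."""
    seen = set()
    curve = []
    for token in tokens:
        seen.add(token)
        curve.append(len(seen))
    return curve
```

To use it, read a UTF-8 file, call strip_vocalisation() on the text if
you want vocalised and unvocalised spellings counted as one type, and
pass the result of tokenize() to the other two functions. The list from
rank_frequency() is what a Zipf plot is drawn from, and the list from
type_token_curve() is the empirical curve to which a Herdan/Heaps power
law would be fitted.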

Let me know if you run into any problems.

Jiří Milička

--------------------------------------------------------------------------
3)
Date: 12 Sep 2013
From: hussein hiyassat <hiyassat at gmail.com>
Subject: Arabic text analysis software response

Please try the CMU language modeling toolkit:

http://www.speech.cs.cmu.edu/SLM_info.html


--------------------------------------------------------------------------
End of Arabic-L: 12 Sep 2013

