[Corpora-List] searchable Arabic spontaneous oral MSA corpus

Emad Mohamed emohamed at umail.iu.edu
Thu Dec 1 20:07:01 UTC 2011


Hello Stephen,
I got interested in this structure a while back, but got busy with other
work.
After reading your email, I decided to check whether this structure also
occurs in more formal discourse, so I searched a small portion of the
Arabic Gigaword corpus using a simple, naive script that still needs a lot
of improvement.

I found three examples in the portion I examined. The script is attached.
It is based on the idea that the structure occurs when kAn is followed by
PREP+NOUN.
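
For readers without the attachment, here is a minimal sketch of that heuristic. It is not the attached script. It assumes the text has been transliterated into Buckwalter and whitespace-tokenized, and that an indefinite accusative shows up as a final alif (A, FA or AF). The word lists and the name find_candidates are hypothetical and deliberately incomplete.

```python
import re

# Hypothetical, incomplete word lists in Buckwalter transliteration.
# They are illustrative assumptions, not taken from the attached script.
KAN_FORMS = {"kAn", "kAnt", "ykwn", "tkwn", "kAnwA"}
PREPOSITIONS = ("ldY", "ldy", "End", "fy", "mE", "ElY", "Ely")

# Assumed surface cue for an indefinite accusative: a final alif,
# optionally written with fathatan (F) before or after it.
ACCUSATIVE = re.compile(r"\S{2,}(?:FA|AF|A)")


def find_candidates(text):
    """Return token spans of the form kAn + PREP+NOUN + word ending in alif."""
    tokens = text.split()
    hits = []
    for i, token in enumerate(tokens):
        if token not in KAN_FORMS:
            continue
        j = i + 1
        # The next token must begin with a preposition (e.g. ldyk, Endh, fy).
        if j >= len(tokens) or not tokens[j].startswith(PREPOSITIONS):
            continue
        # A bare preposition takes the following token as its object.
        if tokens[j] in PREPOSITIONS:
            j += 1
        j += 1  # position of the candidate noun
        if j < len(tokens) and ACCUSATIVE.fullmatch(tokens[j]):
            hits.append(" ".join(tokens[i:j + 1]))
    return hits


if __name__ == "__main__":
    for hit in find_candidates("wAHd An ykwn ldyk hdfA"):
        print(hit)
```

A prefix match on prepositions is crude: it overgenerates (any word that happens to start with fy or mE passes) and it ignores the clitic prepositions l- and b-, so every hit still has to be checked by hand.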

On Thu, Dec 1, 2011 at 3:57 PM, Hewitt, Stephen <s.hewitt at unesco.org> wrote:

>  *Dear all,*
>
> *I am looking for a searchable corpus of spontaneous oral MSA (Modern
> Standard Arabic).*
>
> *I am particularly interested in instances of what I call “faulty
> accusatives”, cf:*
>
> *wāħid: ’an yakūn ladēk hadaf**an* “One: that you should have an
> objective” – Muħammad Ħasanain Haikal, *Ma‘a Haikal (With Haikal)*,
> Al-Jazeera, 2008.03.20.
>
> *In other words, examples of erroneous use (according to Arabic
> grammatical tradition) of the accusative indefinite **–an*, usually
> instead of nominative indefinite *–un* (most often elided).
>
> *What can be observed in spontaneous production of MSA appears to
> correspond very closely to what is known as “syntactic direct object
> [initial consonant] mutation” in Welsh, which in fact covers rather more
> than just indefinite direct objects. Quite a lot has been written on Welsh
> syntactic mutation in recent years, with various explanations for the
> observable instances, not all of which are covered by traditional grammars.
> *
>
> *However, as far as I know, nothing has been written on such “faulty
> accusatives” in spontaneous MSA; the assumption appears to be that since
> there are technically no native speakers, there can be no reliable
> linguistic analysis of such “slips” – they are just random mistakes, not
> worth analysing.*
>
> *I am not convinced that that is the case; I believe that some users of
> MSA achieve near-native fluency, and hence develop their own internal
> grammar, which may not coincide on all points with the formal traditional
> grammar of fuṣḥà. Such error patterns thus become significant, revealing
> something about the speaker’s internal grammar.*
>
> *Can anyone help me to find a reliable and searchable corpus of
> spontaneous oral MSA Arabic (either in Arabic script or in transcription)
> which has not been edited for such “mistakes”?* Al-Jazeera posts
> transcripts of some of their live talk shows in which numerous non-MSA
> items (*šū, mā fīš, dil-wa’ti*, etc.) are faithfully reproduced, but I am
> not certain that *wāħid: ’an yakūn ladēk hadaf**an* would not be edited
> to a more “correct” *wāħid: ’an yakūn ladēk hadaf*.
>
> *Many thanks,*
>
> *Steve Hewitt*
>
> *s.hewitt at unesco.org *
>
> _______________________________________________
> UNSUBSCRIBE from this page: http://mailman.uib.no/options/corpora
> Corpora mailing list
> Corpora at uib.no
> http://mailman.uib.no/listinfo/corpora
>
>


-- 
Emad Soliman Ali Mohamed
aka Emad Nawfal (*عماد نوفل*)
PhD in Linguistics, Computational Linguistics Track,
Department of Linguistics,
Indiana University, Bloomington
http://jones.ling.indiana.edu/~emadnawfal
-------------- next part --------------
A non-text attachment was scrubbed...
Name: findFalseAccusatives.py
Type: text/x-python
Size: 922 bytes
Desc: not available
URL: <http://listserv.linguistlist.org/pipermail/corpora/attachments/20111201/1e7b0c1d/attachment-0001.py>