[Corpora-List] searchable Arabic spontaneous oral MSA corpus

Hewitt, Stephen s.hewitt at unesco.org
Thu Dec 1 12:57:46 UTC 2011


Dear all,

I am looking for a searchable corpus of spontaneous oral MSA (Modern Standard Arabic).

I am particularly interested in instances of what I call “faulty accusatives”, cf:

wāħid: ’an yakūn ladēk hadafan “One: that you should have an objective” – Muħammad Ħasanain Haikal, Ma‘a Haikal (With Haikal), Al-Jazeera, 2008.03.20.

In other words, I am looking for examples of erroneous use (according to Arabic grammatical tradition) of the accusative indefinite -an, usually in place of the nominative indefinite -un (which is most often elided).

What can be observed in spontaneous production of MSA appears to correspond very closely to what is known as “syntactic direct object [initial consonant] mutation” in Welsh, which in fact covers rather more than just indefinite direct objects. Quite a lot has been written on Welsh syntactic mutation in recent years, with various explanations for the observable instances, not all of which are covered by traditional grammars.

However, as far as I know, nothing has been written on such “faulty accusatives” in spontaneous MSA; the assumption appears to be that since there are technically no native speakers, there can be no reliable linguistic analysis of such “slips” – they are just random mistakes, not worth analysing.

I am not convinced that that is the case; I believe that some users of MSA achieve near-native fluency, and hence develop their own internal grammar, which may not coincide on all points with the formal traditional grammar of fuṣḥà. Such error patterns thus become significant, revealing something about the speaker’s internal grammar.

Can anyone help me to find a reliable and searchable corpus of spontaneous oral MSA (either in Arabic script or in transcription) which has not been edited for such “mistakes”? Al-Jazeera post transcripts of some of their live talk shows in which numerous non-MSA items (šū, mā fīš, dil-wa’ti, etc.) are faithfully reproduced, but I am not certain that wāħid: ’an yakūn ladēk hadafan would not be edited to a more “correct” wāħid: ’an yakūn ladēk hadaf.

Many thanks,

Steve Hewitt

s.hewitt at unesco.org
