27.4121, FYI: Spoken Corpus of Cameroon Pidgin English

The LINGUIST List via LINGUIST linguist at listserv.linguistlist.org
Fri Oct 14 15:49:06 UTC 2016


LINGUIST List: Vol-27-4121. Fri Oct 14 2016. ISSN: 1069 - 4875.

Subject: 27.4121, FYI: Spoken Corpus of Cameroon Pidgin English

Moderators: linguist at linguistlist.org (Damir Cavar, Malgorzata E. Cavar)
Reviews: reviews at linguistlist.org (Anthony Aristar, Helen Aristar-Dry,
                                   Robert Coté, Michael Czerniakowski)
Homepage: http://linguistlist.org

*****************    LINGUIST List Support    *****************
                       Fund Drive 2016
                   25 years of LINGUIST List!
Please support the LL editors and operation with a donation at:
           http://funddrive.linguistlist.org/donate/

Editor for this issue: Kenneth Steimel <ken at linguistlist.org>
================================================================


Date: Fri, 14 Oct 2016 11:48:44
From: Gabriel Ozon [g.ozon at sheffield.ac.uk]
Subject: Spoken Corpus of Cameroon Pidgin English

 
We would like to announce the release of the Spoken Corpus of Cameroon Pidgin
English, a pilot corpus of 240,000 words of spoken Cameroon Pidgin English
(CPE), a widely used yet stigmatised and largely uncodified pidgin/creole
variety. This project was funded by a British Academy/Leverhulme grant
(ref. SG140663).

The corpus consists of 80 sound recordings in .wav format of private and public
dialogues and monologues, each approximately 10-15 minutes in length. The
recordings were made in five locations in Cameroon (Bamenda, Buea, Douala,
Kumba and Yaounde). Each sound file has two corresponding transcriptions (each
around 3,000 words in length), one with mark-up only and the other with
mark-up and POS-tagging.

Text categories and the proportions of monologue and dialogue are guided by
those of the International Corpus of English (ICE) project, which makes the
corpus immediately comparable with existing corpora of post-colonial varieties
of English. 

The corpus, which is freely accessible as a resource for linguistic
description and comparison, is available at the Oxford Text Archive:
http://ota.ox.ac.uk/desc/2563 .

The accompanying documentation includes a list of participant data, a tagging
guide and a word list/spelling guide. 

Following successful completion of this pilot project, funding is currently
being sought for the compilation of a larger (1M word) corpus of CPE.

Melanie Green (University of Sussex), Miriam Ayafor (University of Yaounde I),
Gabriel Ozón (University of Sheffield)
 



Linguistic Field(s): General Linguistics
                     Language Documentation
                     Phonetics
                     Phonology
                     Syntax
                     Text/Corpus Linguistics

Subject Language(s): English (eng)
                     French (fra)





 


