29.4592, FYI: A New Corpus for French Studies

The LINGUIST List linguist at listserv.linguistlist.org
Mon Nov 19 21:11:43 UTC 2018


LINGUIST List: Vol-29-4592. Mon Nov 19 2018. ISSN: 1069 - 4875.


Moderator: linguist at linguistlist.org (Malgorzata E. Cavar)
Reviews: reviews at linguistlist.org (Helen Aristar-Dry, Robert Coté)
Homepage: https://linguistlist.org

Please support the LL editors and operation with a donation at:
           https://funddrive.linguistlist.org/donate/

Editor for this issue: Everett Green <everett at linguistlist.org>
================================================================


Date: Mon, 19 Nov 2018 16:10:55
From: Jeanne-Marie Debaisieux [jeanne-marie.debaisieux at Sorbonne-Nouvelle.fr]
Subject: A New Corpus for French Studies

 
Outils et Ressources pour le Français Ecrit et Oral
 
Orfeo (Tools and Resources for Written and Oral French) is a portal that
gives access to the Corpus for the Study of Contemporary French (CEFC). The
corpus consists of 10 million words:

- 4 million words of transcribed spoken French from about XXX hours of
recordings, collected in France, Switzerland, and Belgium across different
diaphasic situations (face-to-face conversations; interviews, debates, and
classroom interactions; lectures, sermons, and speeches; and radio and
television programs).
- 6 million words of written texts from a wide range of genres (e.g.
literature, scientific texts, regional and national press, essays, academic
writing, and non-standard writing).
- CEFC is freely available on the portal:
https://www.ortolang.fr/market/corpora/cefc-orfeo
- The portal gives access to the acoustic files and textual resources. The
corpus is searchable by the textual and register variables available in the
metadata, as well as by lexical and morpho-syntactic (POS) annotations. The
entire corpus is also semi-automatically annotated with syntactic
dependencies, and the search tool can return dependency patterns. All
queries return orthographic transcriptions aligned with audio files. Guides
are provided for all types of annotation. All files (texts, sounds, and
annotations) are freely downloadable.
 



Linguistic Field(s): Text/Corpus Linguistics

Subject Language(s): French (fra)





 


