29.4592, FYI: A New Corpus for French Studies
The LINGUIST List
linguist at listserv.linguistlist.org
Mon Nov 19 21:11:43 UTC 2018
LINGUIST List: Vol-29-4592. Mon Nov 19 2018. ISSN: 1069 - 4875.
Moderator: linguist at linguistlist.org (Malgorzata E. Cavar)
Reviews: reviews at linguistlist.org (Helen Aristar-Dry, Robert Coté)
Homepage: https://linguistlist.org
Please support the LL editors and operation with a donation at:
https://funddrive.linguistlist.org/donate/
Editor for this issue: Everett Green <everett at linguistlist.org>
================================================================
Date: Mon, 19 Nov 2018 16:10:55
From: Jeanne-Marie Debaisieux [jeanne-marie.debaisieux at Sorbonne-Nouvelle.fr]
Subject: A New Corpus for French Studies
Outils et Ressources pour le Français Ecrit et Oral
Orfeo (Tools and Resources for Written and Oral French) is a portal that
gives access to the Corpus for the Study of Contemporary French (CEFC). The
corpus consists of 10 million words:
- 4 million words of transcribed spoken French from about XXX hours of
recordings, collected in France, Switzerland, and Belgium in a range of
diaphasic situations (face-to-face conversations; interviews, debates, and
classroom interactions; lectures, sermons, and speeches; and radio and
television programs).
- 6 million words of written text from a wide range of genres (e.g.
literature, scientific texts, regional and national press, essays, academic
writing, and non-standard writing).
- CEFC is freely available on the portal:
https://www.ortolang.fr/market/corpora/cefc-orfeo
- The portal gives access to the audio files and textual resources. The
corpus is searchable by the textual and register variables recorded in the
metadata, as well as by lexical and morpho-syntactic (POS) annotations. The
entire corpus is also semi-automatically annotated with syntactic
dependencies, and the search tool can return dependency patterns. All
queries return orthographic transcriptions aligned with audio files. Guides
are provided for all types of annotation. All files (texts, sounds, and
annotations) are freely downloadable.
Linguistic Field(s): Text/Corpus Linguistics
Subject Language(s): French (fra)