32.992, Confs: French; Text/Corpus Linguistics/Online

Thu Mar 18 16:02:30 UTC 2021

LINGUIST List: Vol-32-992. Thu Mar 18 2021. ISSN: 1069 - 4875.

Moderator: Malgorzata E. Cavar (linguist at linguistlist.org)
Student Moderator: Jeremy Coburn, Lauren Perkins
Managing Editor: Becca Morris
Team: Helen Aristar-Dry, Everett Green, Sarah Robinson, Nils Hjortnaes, Joshua Sims, Billy Dickson
Jobs: jobs at linguistlist.org | Conferences: callconf at linguistlist.org | Pubs: pubs at linguistlist.org

Homepage: http://linguistlist.org

Editor for this issue: Lauren Perkins <lauren at linguistlist.org>
================================================================

Date: Thu, 18 Mar 2021 12:00:46
From: Michela Russo [mrusso at univ-paris8.fr]
Subject: Le Nouveau Corpus d’Amsterdam (NCA) et la Base de Français Médiéval (BFM) : états et perspectives philologiques et linguistiques/The New Amsterdam Corpus (NCA) and the Base de Français Médiéval (BFM): philological and linguistic status and perspectives

Date: 09-Apr-2021 - 09-Apr-2021 
Location: Paris/Lyon (Virtual conference), France 
Contact: Michela Russo 
Contact Email: mrusso at univ-paris8.fr 
Meeting URL: https://www.sfl.cnrs.fr/journee-detudes-vendredi-9-avril-2021-10h-17h 

Linguistic Field(s): Text/Corpus Linguistics 

Subject Language(s): French (fra)

Meeting Description: 

This workshop focuses on two medieval French corpora: the New Amsterdam Corpus
(NCA, 299 literary texts and text excerpts, including 57 prose texts),
accessible online (TWIC online search:
https://sites.google.com/site/achimstein/research/resources/nca ) or through a
local TXM installation, and the Medieval French Database (BFM, 170 texts),
accessible on the BFM-TXM textometric analysis portal
(http://txm.bfm-corpus.org) and also usable through a local TXM
installation.

The New Amsterdam Corpus (NCA), edited (revised and lemmatized) by Pierre
Kunstmann and Achim Stein, is the new version of the Amsterdam Corpus, a
corpus of Old French literary texts created in the early 1980s by Anthonij
Dees (Vrije Universiteit Amsterdam) and his collaborators (Piet van Reenen and
others). It resulted in the Atlas of the linguistic forms of literary texts of
Old French (Dees et al. 1987).

The forms of these texts were manually annotated by Dees’ team with a set of
numerical tags encoding parts of speech and other morphological categories.
Some texts are electronic versions of existing editions, others are
transcriptions of manuscripts made especially for this corpus. 

The aim of this workshop is to introduce the digital corpus of literary texts
of the New Amsterdam Corpus (NCA), together with its type of syntactic
annotation and its morphological labeling. The corpus is the electronic
version of the texts provided by Piet van Reenen (Free University of
Amsterdam); it contains about 200 different texts written between the
beginning of the twelfth and the end of the fourteenth century (some of them
in several manuscripts, giving a total of 299 texts).

Dees’ team also had a corpus of 3300 local, dated original charters (collected
mainly by Anthonij Dees and Piet van Reenen). The result of this work was the
Atlas of Forms and Constructions of French Charters of the 13th Century (Dees
et al. 1980). Thanks to the Vrije Universiteit Amsterdam, a large part of these
charters has been digitized (in its grammatical parts: nominal groups,
pronominal groups, etc.).

During this workshop, we will focus on the description of these 13th-century
charters, Parisian and Anglo-Norman charters, and the Aube charters (made
available thanks to Piet van Reenen), and on their morphological annotation
(320,000 words, annotated with POS and numerical codes). 

As for the BFM, the Base de français médiéval, it has been located at the ENS
de Lyon since its inception. Founded in 1989 by Christiane Marchello-Nizia,
the BFM is currently managed by Céline Guillot-Barbance, scientific director,
and Alexei Lavrentiev, director of digital philology. It contains several
digital corpora of French texts written between the 9th and the end of the
15th century. The texts are morphosyntactically annotated and lemmatized, and
direct speech passages are encoded. Access to the BFM is open and is provided
through the TXM textometric analysis platform, which offers several search and
analysis functions, such as word concordances and textual patterns.
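
For readers unfamiliar with the term, a word concordance lists every
occurrence of a search word together with its left and right context. The
Python sketch below is only an illustration of that idea and makes no claim
about how TXM implements it; the function name kwic, the naive tokenization,
and the sample sentence (the well-known opening of the Chanson de Roland, not
drawn from either corpus) are assumptions made for the example.

```python
import re


def kwic(text, keyword, width=3):
    """Return (left, hit, right) context triples for each hit of keyword.

    Hypothetical helper for illustration; not part of TXM or the BFM.
    """
    # Naive tokenization: lowercase the text, keep runs of letters only.
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    rows = []
    for i, tok in enumerate(tokens):
        if tok == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            rows.append((left, tok, right))
    return rows


# Sample input only; any plain-text passage would do.
sample = ("Carles li reis, nostre emperere magnes, "
          "set anz tuz pleins ad estet en Espaigne.")

for left, hit, right in kwic(sample, "li", width=2):
    print(f"{left:>20} [{hit}] {right}")
```

A real corpus platform would search annotated tokens (word form, lemma,
part of speech) rather than raw strings, which this sketch does not attempt.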

The NCA and the BFM constitute two valuable resources for medieval French.

The French version is available here:
https://www.sfl.cnrs.fr/journee-detudes-vendredi-9-avril-2021-10h-17h

Program Information: 

See the scheduled program at:
https://www.sfl.cnrs.fr/journee-detudes-vendredi-9-avril-2021-10h-17h 

Organizers: Michela Russo / Clémence Jaime / Céline Guillot-Barbance /
Alexei Lavrentiev

Invited speakers: Achim Stein (Institut für Linguistik/Romanistik,
Universität Stuttgart) & Alexei Lavrentiev (ENS de Lyon)

This workshop includes two sessions on medieval French and digital sources,
open to master's and doctoral students. All colleagues and students are
cordially invited to participate upon registration.

Contact: michela.russo at cnrs.fr & celine.guillot at ens-lyon.fr

Abstracts:
Talk by Achim Stein (Institut für Linguistik/Romanistik, Universität
Stuttgart)
The New Amsterdam Corpus (NCA): origins, annotation and perspectives
In the first part of this talk, I will present the genesis of the oldest
digital corpus of medieval French, from the files established by Anthonij
Dees’ team at the Free University of Amsterdam in the 1980s to its re-edition
25 years later. The second part will be devoted to the conversion of the
original data and to the attempts at, and challenges of, lemmatization. In the
final part, I will discuss the position that the NCA occupies today in the
landscape of historical corpora and its usefulness from a philological and
technical point of view.

Talk by Alexei Lavrentiev & Céline Guillot-Barbance (IHRIM - CNRS & ENS
de Lyon)
The Medieval French Database (BFM = Base de français médiéval) in 2021: current
status and ongoing developments
This talk/demonstration will focus on the lesser-known features of the
Medieval French Database (BFM = Base de français médiéval). It will deal with
morphosyntactic labeling (Cattex and UD) and lemmatization (automatic and
verified), as well as quantitative analysis tools (progression, specificities,
factorial correspondence analysis, co-occurrences) provided by the TXM
application and not yet available on the online portal. The new features of
the BFM 2021 corpus, scheduled for publication in June-July, will be presented
as a conclusion.

Session/Atelier 1
NCA under linguistic analysis. The example of partitivity in Old French (resp.
Achim Stein/Michela Russo/Clémence Jaime)
In this group, students will explore the corpus features of a local
NCA/TXM installation, from syntactic queries with the TIGERSearch interface
(implemented online for GRAAL on the BFM/TXM portal) to the diatopic
indications (area code, location used in the atlas) and the original
annotation of the Amsterdam Corpus. Achim Stein will show students the
differences between the results of manual analysis (with reference to the
SRCMF, the Syntactic Reference Corpus of Medieval French, http://srcmf.org/)
and automatic analysis (of the NCA). He will also introduce the students to
automatic (dependency) syntactic analysis of Old French, for example by
showing a treebank and applying it to the NCA.
Clémence Jaime (M2 student in "Linguistics and dialectology" at UJM Lyon 3)
will illustrate, using the online and local BFM/NCA/TXM interface (including
regular expressions), "The example of partitivity in Old French", the research
subject of her master's thesis.
[Students are advised to install the TXM software
(http://textometrie.ens-lyon.fr/spip.php?rubrique61), the NCA
(https://sites.google.com/site/achimstein/research/resources/nca), and the
TIGERSearch zip archive nca3-for-tiger.zip.]

Session/Atelier 2 (resp. Alexei Lavrentiev/Zeina Tmart & Céline
Guillot-Barbance)
In this session, Zeina Tmart (PhD student at ENS de Lyon) will present
her research project on the evolution of coordination in French between the
12th and 16th centuries. The presentation will go from the design of the
corpus to its annotation with TXM and the exploitation of the results. The
workshop will allow students to work on the annotation of concordances with
the TXM software. This feature makes it possible to correct errors in automatic
labeling and annotation and to add further annotations to the words of the
corpus.

Join Zoom Meeting:
https://zoom.us/j/94018653392?pwd=TTZzbVNJQTlndk5DaTVYcU8wOGFnZz09
Meeting ID: 940 1865 3392
Passcode: SFL (One tap mobile: Passcode 396522)
Find your local number: https://zoom.us/u/aDZbg0NKx
