33.1255, Calls: Computational Linguistics/France

Fri Apr 8 06:31:36 UTC 2022

LINGUIST List: Vol-33-1255. Fri Apr 08 2022. ISSN: 1069 - 4875.

Moderator: Malgorzata E. Cavar (linguist at linguistlist.org)
Student Moderator: Billy Dickson
Managing Editor: Lauren Perkins
Team: Helen Aristar-Dry, Everett Green, Sarah Goldfinch, Nils Hjortnaes,
      Joshua Sims, Billy Dickson, Amalia Robinson, Matthew Fort
Jobs: jobs at linguistlist.org | Conferences: callconf at linguistlist.org | Pubs: pubs at linguistlist.org

Homepage: http://linguistlist.org

Editor for this issue: Everett Green <everett at linguistlist.org>
================================================================

Date: Fri, 08 Apr 2022 02:27:58
From: Victoria Arranz [arranz at elda.org]
Subject: Multilingual De-Identification of (Sensitive) Language Resources

Full Title: Multilingual De-Identification of (Sensitive) Language Resources 
Short Title: MDLR 

Date: 20-Jun-2022 - 20-Jun-2022
Location: Marseille, France
Contact Person: Victoria Arranz
Meeting Email: arranz at elda.org
Web Site: https://sites.google.com/vicomtech.org/multilingual-de-identification 

Linguistic Field(s): Computational Linguistics 

Call Deadline: 10-Apr-2022 

Meeting Description:

This workshop is organised by members of the MAPA project, funded by the EU
Connecting Europe Facility (CEF) program (https://mapa-project.eu/).
This project has developed a toolkit for the de-identification of texts in the
medical and legal fields which addresses all EU official languages. It has
followed a BERT-based Named Entity Recognition approach for personal
information identification. The project has touched on a wide range of topics,
all of which are open for discussion among workshop participants. They include
the following:

1.  Sensitive personal information, domains and services that require
de-identification
2.  Corpora annotation and/or creation
3.  Annotation guidelines and platforms
4.  De-identification tools, data and/or applications
5.  De-identification and minority languages
6.  Multi-domain and/or multilingual processing
7.  NLP techniques and tools used for de-identification
8.  Multimodal de-identification
9.  Validation and benchmarking of de-identified resources
10. Evaluation of de-identification tools and applications
11. Evaluation protocols: how to evaluate, metrics, approaches, data,
experiences
12. Best practices
13. Experience with approaches, activities and systems addressing
“anonymization”
14. Any other topic related to de-identification

This workshop will also be a good forum to discuss the possibility of designing
and launching a new (annual) Challenge (evaluation campaign) on this important
topic.
We invite submissions of full papers and system demonstrations that address
these questions and other related issues relevant to the workshop.

Call for Papers:

Workshop on Multilingual De-Identification of (Sensitive) Language Resources

To be held in conjunction with the 13th International Language Resources and
Evaluation Conference (LREC 2022)
20 June 2022, Le Palais du Pharo, Marseille, France
https://sites.google.com/vicomtech.org/multilingual-de-identification

Deadline for submission: 10 April 2022

Description:

The General Data Protection Regulation (GDPR - Regulation (EU) 2016/679 of the
European Parliament and of the Council of 27 April 2016) governs the
protection of natural persons with regard to the processing of personal data
and the free movement of such data. The GDPR outlines a specific set of
rules that protect citizens and user data and create transparency in
information sharing. It is widely regarded as one of the strictest data
privacy regulations in the world, and considerable work is taking place to
develop techniques and deploy systems that help comply with this regulation
while rendering data accessible and, thus, usable for further processing.
Different techniques are being studied to achieve such compliance. They offer
different levels of protection for sensitive content, and their guarantees
hold in the short or long term depending on whether additional related
information may become available. In this regard, work has been published on
anonymization, de-identification and pseudonymization. While anonymization
implies a zero
re-identification risk, which is extremely difficult to secure,
de-identification and pseudonymization represent an attainable target under
the GDPR, given that this regulation defines pseudonymization as “the
processing of personal data in such a manner that the personal data can no
longer be attributed to a specific data subject without the use of additional
information, provided that such additional information is kept separately and
is subject to technical and organisational measures to ensure that the
personal data are not attributed to an identified or identifiable natural
person.”
Bearing this context in mind, multilingual approaches and toolkits for the
de-identification of (sensitive) language resources may provide the means to
share language data while also protecting private or sensitive data by
detecting and then deleting, obfuscating, pseudonymizing or encrypting
personally identifying information.
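
As a purely illustrative sketch, not taken from the MAPA toolkit, the
difference between deletion and pseudonymization could look as follows in
Python. A simple regular expression for e-mail addresses stands in for a real
detector such as a NER model; the pattern and the function names pseudonymize
and redact are assumptions made only for this example.

```python
import re

# Hypothetical pattern: a simple e-mail matcher standing in for a real
# NER-based detector (an assumption made only for this illustration).
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def pseudonymize(text):
    """Replace each distinct e-mail address with a stable placeholder.

    Returns the rewritten text and the lookup table. The table would have
    to be stored separately from the text for the result to count as
    pseudonymized rather than merely reversible.
    """
    table = {}

    def substitute(match):
        value = match.group(0)
        # Reuse the same placeholder when the same address appears again.
        if value not in table:
            table[value] = f"[EMAIL_{len(table) + 1}]"
        return table[value]

    return EMAIL_RE.sub(substitute, text), table


def redact(text):
    """Deletion-style de-identification: no lookup table is kept."""
    return EMAIL_RE.sub("[REDACTED]", text)


if __name__ == "__main__":
    sample = "Write to ana@example.org or bob@example.com."
    rewritten, lookup = pseudonymize(sample)
    print(rewritten)
    print(lookup)
    print(redact(sample))
```

In this sketch, the lookup table returned by pseudonymize plays the role of
the “additional information” in the GDPR definition quoted above: the
placeholders can only be mapped back to the original addresses by someone who
holds that table, whereas redact keeps no such table at all.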

De-identification is typically performed to protect an individual’s private
activities while maintaining the usefulness of the gathered data for research
and development purposes. This workshop aims to discuss the various approaches
to effective and reliable text de-identification, focusing on, but not limited
to, sensitive domains such as the medical and legal domains.

Full CfP:
https://sites.google.com/vicomtech.org/multilingual-de-identification/call-for-papers?authuser=0
