
Development of algorithms to prevent information leakage in the federated learning setting (M/F)


Offer details

General information

Parent organization

The Commissariat à l'énergie atomique et aux énergies alternatives (CEA) is a French public research organization.

A major player in research, development and innovation, the CEA operates in four mission areas:
. defense and security
. nuclear energy (fission and fusion)
. technological research for industry
. fundamental research (material sciences and life sciences).

With its 16,000 employees (technicians, engineers, researchers and research support staff), the CEA takes part in numerous collaborative projects alongside its academic and industrial partners.

Reference

2021-19232  

Description of the division

Located in Saclay, south of Paris, CEA List (http://www-list.cea.fr/) is a scientific and technological research center dedicated to the development of software, embedded systems and sensors for applications such as defense, security, energy, nuclear power, environment and health. CEA List is part of the dynamic and stimulating ecosystem of the University of Paris-Saclay, the largest French scientific center with 60,000 students (now ranked 13th in the Shanghai world ranking). The institute has more than 700 researchers focusing on intelligent digital systems.

Description of the unit

Within the List institute, the SID (Data Intelligence Service) works on algorithms and methodologies for artificial intelligence and signal processing. The laboratory's research and technological advances are guided by a variety of applications whose specificities and constraints, on the data or on the execution environment, call for careful design of the AI components and their integration as the building blocks of complex systems.

Job description

Field

Information systems

Contract

Fixed-term contract (CDD)

Job title

Development of algorithms to prevent information leakage in the federated learning setting (M/F)

Position status

Executive (cadre)

Contract duration (in months)

24

Offer description

Federated learning (FL) is a new machine learning (ML) paradigm [1] for building models in a collaborative and decentralized way. Unlike traditional server-side approaches that aggregate data on a central server for training, FL leaves the training data distributed on the end users' devices and learns a shared model by aggregating only ephemeral, locally computed updates. This decentralized optimization procedure helps ensure data privacy and reduces communication costs, as the data remains in its original location. However, FL still faces several key challenges [2]: statistical heterogeneity (dealing with non-IID data), systems heterogeneity (dealing with devices that differ in hardware, network connectivity and power), security (ensuring robustness to attacks and failures) and privacy (preventing the disclosure of information about individuals).
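
For illustration only (not part of the official job description), here is a minimal NumPy sketch of the weighted aggregation step described above, under the assumption that each client sends its locally computed update as an array together with its local sample count:

import numpy as np

def federated_average(client_updates, client_sizes):
    # Weighted average of the ephemeral, locally computed updates;
    # the raw training data never leaves the clients' devices.
    total = sum(client_sizes)
    return sum(u * (n / total) for u, n in zip(client_updates, client_sizes))

# One simulated aggregation round with three clients holding different amounts of data.
client_updates = [np.array([0.1, 0.2]), np.array([0.3, 0.1]), np.array([0.2, 0.4])]
client_sizes = [100, 50, 150]
global_update = federated_average(client_updates, client_sizes)
print(global_update)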

 

The candidate will address the problem of privacy. FL intrinsically protects the data stored on each device by sharing model updates instead of the original data. It is thus a solution for data privacy, but not for model privacy: even though model parameters contain significantly less information about users than the raw data, it is sometimes possible to infer sensitive information from an ML model [3]. In the FL setting, the threat also comes from the participants (i.e. clients and server), who can try to reconstruct information about individuals by deploying GANs throughout the training process [4;5].
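
As a simple numerical illustration of why shared updates are not harmless (a much more basic setting than the GAN-based attacks of [4;5]), the hypothetical NumPy sketch below shows that, for a single linear layer and a single training sample, a curious server can recover the client's private input exactly from the shared gradients:

import numpy as np

rng = np.random.default_rng(0)

# A private sample x held by a client, and a linear layer y = W x + b
# trained with a squared-error loss against some target.
x = rng.normal(size=4)
W = rng.normal(size=(3, 4))
b = rng.normal(size=3)
target = rng.normal(size=3)

y = W @ x + b
dL_dy = 2.0 * (y - target)   # gradient of the loss w.r.t. the layer output

# Gradients that a naive FL client would share with the server.
dL_dW = np.outer(dL_dy, x)
dL_db = dL_dy

# The server reconstructs the private input from the shared gradients alone
# (any row with a non-zero bias gradient works).
x_reconstructed = dL_dW[0] / dL_db[0]
print(np.allclose(x_reconstructed, x))   # True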

The candidate will study, implement and evaluate such attacks. Defenses will then be proposed and developed to prevent information leakage toward the other clients (e.g. via differential privacy [6]) and toward the central server (e.g. via secure aggregation [7]).
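
As an illustrative sketch only (not a specification of the project's defenses), client-side clipping and Gaussian noising of a local update, in the spirit of differential privacy [6], could look as follows; the function name, parameter names and values are assumptions:

import numpy as np

def privatize_update(update, clip_norm=1.0, noise_multiplier=1.1, rng=None):
    # Clip the local update to a bounded L2 norm, then add calibrated Gaussian
    # noise before sharing it: clipping bounds any single client's influence,
    # and the added noise is what yields the differential-privacy guarantee.
    rng = rng or np.random.default_rng()
    norm = np.linalg.norm(update)
    clipped = update * min(1.0, clip_norm / max(norm, 1e-12))
    noise = rng.normal(scale=noise_multiplier * clip_norm, size=update.shape)
    return clipped + noise

noisy_update = privatize_update(np.array([0.4, -1.2, 0.7]))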

 

The candidate will join a collaborative project between CEA List and an industrial partner, a multinational group that is a leader in its field, located near Paris; the project aims to segment medical images in a decentralized way. The applicant should have excellent skills in both machine learning and Python, and should enjoy working as part of a team in a research environment.

[1] Google AI Blog: https://ai.googleblog.com/2017/04/federated-learning-collaborative.html
[2] P. Kairouz et al., "Advances and Open Problems in Federated Learning", arXiv, Dec. 2019.
[3] M. Fredrikson, S. Jha, and T. Ristenpart, "Model Inversion Attacks that Exploit Confidence Information and Basic Countermeasures", in Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security, pp. 1322-1333, ACM, 2015.
[4] B. Hitaj, G. Ateniese, and F. Perez-Cruz, "Deep Models Under the GAN: Information Leakage from Collaborative Deep Learning", arXiv, Sep. 2017.


To apply, send an up-to-date CV and a cover letter to:

Aurélien Mayoue (aurelien.mayoue@cea.fr)

Jérôme Gauthier (jerome.gauthier@cea.fr)

Candidate profile

1.    PhD in machine learning from an accredited university

2.    Excellent skills in Python

3.    Excellent communication skills, both verbal and written, in English and French

4.    Experience in federated learning is a plus

5.    Experience in distributed systems and/or robustness to attacks is a plus

Job location

Site

Saclay

Location

France, Ile-de-France, Essonne (91)

City

Saclay

Requester

Position availability

03/01/2022