Privacy-Preserving Collaborative Machine Learning?

Wednesday, February 15, 2023 - 14.30 (GMT+1)

Dario Pasquini

EPFL Lausanne



Deep learning requires massive data sets to reach peak performance. Unfortunately, collecting suitable data sets is difficult and sometimes impossible. Entities and organizations may be unwilling to share their internal data for fear of releasing sensitive information. For instance, telecommunication companies would benefit greatly from deep learning techniques but do not wish to expose customer data to their competitors. Similarly, medical institutions cannot share information because privacy laws and regulations protect patient data.

It is here that collaborative machine learning comes to our rescue. Collaborative machine learning protocols such as “Federated Learning” enable a set of parties to train a machine learning model in a distributed fashion. In the process, users’ private data remains local on their devices, ensuring that no private information is leaked, right? Well, that may not be strictly true. This talk is about the misconceptions, inaccurate assumptions, unrealistic trust models, and flawed methodologies affecting current collaborative machine learning techniques. In the presentation, we cover security issues concerning both emerging approaches and well-established solutions in the privacy-preserving collaborative machine learning landscape. The objective is to highlight the general errors and flawed approaches we should all avoid when devising, implementing, and using “privacy-preserving collaborative machine learning”.