Learning from Failures: Secure and Fault-Tolerant Aggregation for Federated Learning
Federated learning allows multiple parties to collaboratively train a global machine learning (ML) model without sharing their private datasets. To ensure that these local datasets remain private, existing works rely on a secure aggregation scheme that allows parties to encrypt their model updates before sending them to the central server, which aggregates the encrypted inputs. In this work, we design and evaluate a new secure and fault-tolerant aggregation scheme for federated learning that is robust against client failures. We first propose a threshold variant of the secure aggregation scheme proposed by Joye and Libert. Using this new building block together with a dedicated decentralized key management scheme and a dedicated input encoding solution, we design a privacy-preserving federated learning protocol that, when executed among n clients, can recover from up to n/3 client failures. Our solution is secure against a malicious aggregator that manipulates messages in an attempt to learn clients' individual inputs. We show that our solution outperforms state-of-the-art fault-tolerant secure aggregation schemes in terms of computation cost on both the client and the server side. For example, for an ML model with 100,000 parameters trained by 600 clients, our protocol is 5.5x faster at the client (1.6x faster when 180 clients drop out) and 1.3x faster at the server.
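For intuition, the sketch below illustrates the basic (non-threshold) Joye-Libert aggregator-oblivious encryption scheme that the abstract builds on: each client masks its input with a per-round value derived from its secret key, and the masks cancel only in the aggregate, so the server recovers the sum without seeing any individual input. This is a minimal toy sketch under assumed parameters (small demo primes, a SHA-256 hash into Z*_{N^2}); the paper's threshold variant, decentralized key management, and input encoding are not shown.

```python
# Toy sketch of Joye-Libert aggregator-oblivious encryption (FC 2013).
# Illustrative assumptions: small demo primes and SHA-256 as the hash
# into Z*_{N^2}; NOT the paper's concrete or threshold instantiation.
import hashlib
import secrets

p, q = 1000003, 1000033          # toy primes; real deployments use an RSA-size N
N = p * q
N2 = N * N

def H(t: int) -> int:
    """Hash the round label t into Z*_{N^2} (random-oracle style, assumed)."""
    d = hashlib.sha256(f"round-{t}".encode()).digest()
    return int.from_bytes(d, "big") % N2

def keygen(n: int):
    """Client keys s_1..s_n plus the aggregator key s_0 = -sum(s_i)."""
    sks = [secrets.randbelow(N2) for _ in range(n)]
    return sks, -sum(sks)

def encrypt(s_i: int, x_i: int, t: int) -> int:
    """c_i = (1+N)^{x_i} * H(t)^{s_i} mod N^2; note (1+N)^x = 1 + x*N mod N^2."""
    return (1 + x_i * N) * pow(H(t), s_i, N2) % N2

def aggregate(s_0: int, cts: list, t: int) -> int:
    """The aggregator learns only the sum: H(t)^{s_0} * prod(c_i) mod N^2."""
    v = pow(H(t), s_0, N2)        # negative exponent -> modular inverse (Python 3.8+)
    for c in cts:
        v = v * c % N2
    return (v - 1) // N           # (1+N)^sum mod N^2 = 1 + sum*N, if sum < N

# Usage: three clients, one round.
sks, s0 = keygen(3)
xs = [5, 7, 11]
cts = [encrypt(s, x, t=1) for s, x in zip(sks, xs)]
assert aggregate(s0, cts, t=1) == sum(xs)  # 23
```

Because the aggregator's key is fixed as the negative sum of all client keys, the masks cancel only when every client contributes; this is exactly why a dropped client breaks decryption in the basic scheme, and why the paper's threshold variant and key management are needed to tolerate up to n/3 failures.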