ACSAC2016 Program

Deep learning in a collaborative setting is emerging as a cornerstone of many upcoming applications, wherein untrusted users collaborate to generate more accurate models. From the security perspective, this opens collaborative deep learning to poisoning attacks, wherein adversarial users deliberately alter their inputs to mis-train the model. These attacks are known for machine learning systems in general, but their impact on new deep learning systems is not well-established.

We investigate the setting of indirect collaborative deep learning, a form of practical deep learning wherein users submit masked features rather than direct data. Indirect collaborative deep learning is preferred over direct, because it distributes the cost of computation and can be made privacy-preserving. In this paper, we study the susceptibility of collaborative deep learning systems to adversarial poisoning attacks. Specifically, we obtain the following empirical results on two popular datasets for handwritten images (MNIST) and traffic signs (GTSRB) used in auto-driving cars. For collaborative deep learning systems, we demonstrate that the attacks have a 99% success rate for misclassifying specific target data while poisoning only 10% of the entire training dataset.
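To make the 10% poisoning budget concrete, the following is a minimal sketch of one possible targeted poisoning step, assuming the adversary relabels samples of one class as another. The abstract does not specify how inputs are altered, so label flipping, the function name `poison_labels`, and the toy label set are all illustrative assumptions rather than the paper's method.

```python
import random

def poison_labels(labels, source, target, fraction, seed=0):
    """Relabel up to `fraction` of the whole dataset from `source` to `target`.

    Hypothetical helper: label flipping is an assumed attack strategy,
    not one confirmed by the abstract.
    """
    rng = random.Random(seed)
    # Attack budget, counted against the entire training set.
    budget = round(len(labels) * fraction)
    # Only samples of the source class are candidates for relabeling.
    candidates = [i for i, y in enumerate(labels) if y == source]
    chosen = rng.sample(candidates, min(budget, len(candidates)))
    poisoned = list(labels)  # leave the original list untouched
    for i in chosen:
        poisoned[i] = target
    return poisoned, sorted(chosen)

# Toy label set (an assumption): 100 samples, 10 per digit class.
labels = [i % 10 for i in range(100)]
poisoned, changed = poison_labels(labels, source=1, target=7, fraction=0.10)
print(f"poisoned {len(changed)} of {len(labels)} labels")
```

In this sketch the budget is measured against the whole dataset, matching the abstract's phrasing, while the altered samples all come from a single source class, which is what makes the attack targeted.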

As a defense, we propose AUROR, a system that detects malicious users and generates an accurate model. The accuracy under the deployed defense on practical datasets is nearly unchanged when operating in the absence of attacks. The accuracy of a model trained using AUROR drops by only 3% even when 30% of all the users are adversarial. AUROR provides a strong guarantee against evasion; if the attacker tries to evade, its attack effectiveness is bounded.

Author(s):

Shiqi Shen
National University of Singapore
Singapore

Shruti Tople
National University of Singapore
Singapore

Prateek Saxena
National University of Singapore
Singapore