Mitigating Membership Inference Attacks by Weighted Smoothing
Membership inference attacks (MIAs) are a known privacy threat to neural network models. Existing state-of-the-art mitigations are based on the notion of differential privacy; while they provide sound theoretical guarantees, they often incur a significant drop in model accuracy. In this work, we identify a correlation between class diversity and the effectiveness of MIAs, which sheds light on their root cause. Based on this insight, we design an approach, termed weighted smoothing, that selectively adds noise to training samples according to the sample distribution within each class. Experimental results show that our approach significantly outperforms existing approaches, reducing the effectiveness of MIAs while maintaining accuracy. In particular, our approach reduces the effectiveness of two state-of-the-art MIAs to close to that of random guessing (i.e., a success rate of 0.5) with nearly zero accuracy loss.
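Since the abstract only sketches the idea at a high level, the following is a minimal, hypothetical illustration rather than the paper's actual algorithm: it assumes training samples are given as a NumPy array with integer class labels, and scales each sample's Gaussian noise by its (normalized) distance from its class centroid, so that the amount of noise reflects where the sample sits within its class distribution. The weighting scheme and the parameter `base_sigma` are assumptions for illustration only.

```python
import numpy as np

def weighted_smoothing(X, y, base_sigma=0.1):
    """Add per-sample Gaussian noise whose scale depends on the sample's
    position within its class distribution (hypothetical weighting:
    normalized distance from the class centroid)."""
    X_noisy = X.astype(float).copy()
    for c in np.unique(y):
        idx = np.where(y == c)[0]
        class_samples = X_noisy[idx]
        centroid = class_samples.mean(axis=0)
        # Distance of each sample from its class centroid (features flattened).
        dists = np.linalg.norm(
            class_samples.reshape(len(idx), -1) - centroid.reshape(1, -1), axis=1
        )
        # Normalize to [0, 1]; samples farther from the centroid get larger noise.
        weights = dists / (dists.max() + 1e-12)
        # Broadcast per-sample weights over the remaining feature dimensions.
        weights = weights.reshape(-1, *([1] * (X.ndim - 1)))
        noise = np.random.randn(*class_samples.shape)
        X_noisy[idx] = class_samples + base_sigma * weights * noise
    return X_noisy
```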