Annual Computer Security Applications Conference (ACSAC) 2023

Protecting Your Voice from Speech Synthesis Attacks

In recent years, much attention has been paid to speech synthesis, which aims to generate synthetic speech in the voice of a target speaker. Although speech synthesis has enabled a wide spectrum of applications that positively impact our daily lives, it can also be used by attackers to perform speech synthesis attacks. An attacker can use this technique to mimic the voice of a victim and transform arbitrarily chosen text or voice samples into the same content spoken in the victim's voice. To protect a speaker's voice from speech synthesis attacks, in this paper we propose two novel defense schemes that the speaker can use to process his or her speech before publishing it on social media platforms or sending it to others. The processed speech not only significantly degrades the performance of speech synthesis systems but also preserves the sound of the speaker's voice, so it can still be used for normal purposes. The desirable performance of the proposed defense schemes is verified through extensive experiments conducted on several real-world speaker recognition (SR) systems and a user study on a public crowdsourcing platform.

Zihao Liu
Iowa State University

Yan Zhang
Iowa State University

Chenglin Miao
Iowa State University

Paper (ACM DL)

Slides