NDSS

SurfingAttack: Interactive Hidden Attack on Voice Assistants Using Ultrasonic Guided Waves

Qiben Yan (Michigan State University), Kehai Liu (Chinese Academy of Sciences), Qin Zhou (University of Nebraska-Lincoln), Hanqing Guo (Michigan State University), Ning Zhang (Washington University in St. Louis)

With recent advances in artificial intelligence and natural language processing, voice has become a primary method for human-computer interaction, which has enabled game-changing new technologies in both commercial sector such as Siri, Alexa or Google Assistant and the military sector in voice-controlled naval warships. Recently, researchers have demonstrated that these voice assistant systems are susceptible to signal injection from voice commands at the inaudible frequency. To date, most of the existing work focus primarily on delivering a single command via line-of-sight ultrasound speaker and extending the range of this attack via speaker array. However, sound waves also propagate through other materials where vibration is possible. In this work, we aim to understand the characteristics of this new genre of attack in the context of different transmission media. Furthermore, by leveraging the unique properties of acoustic transmission in solid materials, we design a new attack called SurfingAttack that will allow multiple rounds of interactions with the voice-controlled device over a longer distance and without the need to be in line-of-sight, resulting in minimal change to the physical environment. This has greatly elevated the potential risk of inaudible sound attack, enabling many new attack scenarios, such as hijacking a mobile Short Message Service (SMS) passcode, making ghost fraud calls without owners' knowledge, etc. To accomplish SurfingAttack, we have solved several major challenges. First, the signal has been specially designed to allow omni-directional transmission for performing effective attacks over a solid medium. Second, the new attack enables two-way communication without alerting the legitimate user at the scene, which is challenging since the device is designed to interact with human in physical proximity rather than sensors. To mitigate this newly discovered threat, we also provide discussions and experimental results on potential countermeasures to defend against this new threat.