
Comparing Robustness Notions in ASR Systems

Supervisor(s): Karla Markert, Chingyu Kao
Status: finished
Topic: Others
Author: Tarik Özkahraman
Submission: 2022-12-15
Type of Thesis: Bachelor's thesis
Thesis topic in co-operation with the Fraunhofer Institute for Applied and Integrated Security AISEC, Garching

Description

Automatic speech recognition (ASR) systems have been successfully integrated into
modern life through voice-controlled smart assistants. However, robust systems and
research into new methods of attack and defense are necessary to ensure the continued
adoption of such devices and their safe use. A robust system must be able to filter
noise without compromising functionality while simultaneously remaining secure against
adversarial attacks. In the audio domain, "noise robustness" refers to the ability of a
classifier to transcribe speech correctly even when the input is perturbed by various
kinds of noise. "Adversarial robustness", on the other hand, refers to a model's ability
to withstand adversarial attacks: crafted audio inputs that deceive the system's
classifier without being noticed and thus provoke a misclassification.
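
To make the distinction concrete, the minimal sketch below contrasts a random noise
perturbation with a worst-case adversarial one of comparable size. It is only an
illustration: the PyTorch model, waveform, and label are hypothetical toy stand-ins for
a real ASR system, and the single FGSM gradient step stands in for the stronger
optimization-based attacks discussed here.

import torch

torch.manual_seed(0)

# Toy stand-in for an ASR classifier: waveform (batch, samples) -> class logits.
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(16000, 10))
loss_fn = torch.nn.CrossEntropyLoss()

waveform = torch.randn(1, 16000)   # one second of 16 kHz audio (synthetic)
label = torch.tensor([3])          # hypothetical correct class

# Noise robustness concerns random, non-targeted perturbations such as this one.
noisy = waveform + 0.01 * torch.randn_like(waveform)

# Adversarial robustness concerns worst-case perturbations of similar magnitude;
# here one FGSM step pushes the input in the direction that increases the loss.
waveform.requires_grad_(True)
loss = loss_fn(model(waveform), label)
loss.backward()
adversarial = waveform + 0.01 * waveform.grad.sign()

# Both inputs deviate from the original by a comparable amount, but only the
# adversarial one is chosen specifically to flip the model's prediction.
with torch.no_grad():
    print(model(noisy).argmax(dim=1), model(adversarial).argmax(dim=1))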
Previous research on robustness in image classification shows that the relationship
between robustness to adversarial noise and robustness to general noise is not yet
entirely clear. This is especially true in speech recognition, for which few
evaluations exist. The risk of ASR systems being manipulated by third parties
increases if an attacker can deceive systems that are not known to them. This is
precisely what transferability attacks make possible: adversarial examples are crafted
and tested on a model the attacker knows, and if transferability can be established,
the same examples are applied to other systems. Our results show that transferability
attacks based on optimization attacks cannot be performed efficiently.
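
As a rough illustration of such a transferability test, the sketch below crafts
adversarial examples on a known surrogate model and measures how often they also fool
a separate target model. The toy models, data, and single-step attack are hypothetical
placeholders for the ASR systems and optimization-based attacks evaluated in the
thesis; a real evaluation would use trained models and matched inputs.

import torch

torch.manual_seed(0)

def make_toy_model():
    return torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(16000, 10))

surrogate = make_toy_model()  # white-box model the attacker can inspect
target = make_toy_model()     # black-box model the attack should transfer to
loss_fn = torch.nn.CrossEntropyLoss()

waveforms = torch.randn(8, 16000, requires_grad=True)
labels = torch.randint(0, 10, (8,))

# Craft the perturbations on the surrogate only (one FGSM step as a
# placeholder for an optimization-based attack).
loss_fn(surrogate(waveforms), labels).backward()
adversarial = (waveforms + 0.01 * waveforms.grad.sign()).detach()

# Transferability: fraction of surrogate-crafted examples that also move the
# target model's prediction away from the correct label.
with torch.no_grad():
    fooled = (target(adversarial).argmax(dim=1) != labels).float().mean()
print(f"transfer success rate: {fooled.item():.2f}")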