Description
Automatic speech recognition (ASR) systems have been successfully integrated into
modern life through voice-controlled smart assistants. However, robust systems and
research into new methods of attack and defense are necessary to ensure the continued
adoption of these devices and their safe use. Robust systems must be able to filter noise
without compromising functionality while simultaneously remaining secure against adversarial attacks. The
term "noise robustness" in audio refers to the ability of a classifier to transcribe speech correctly,
even when perturbed by various noises. On the other hand, the term "adversarial robustness"
refers to the model's ability to withstand adversarial attacks. These attacks use audio input to
deceive the system's classifier without being noticed and thus provoke a misclassification.
Previous research on robustness in image classification shows that the relationship
between robustness to adversarial noise and robustness to general noise is not yet entirely clear.
This is especially true for speech recognition, where few such evaluations exist. The risk of ASR
systems being manipulated by third parties increases if even systems that are not known to the
attacker can be deceived. This is precisely what transferability attacks make possible: adversarial
examples are first crafted and tested against systems known to the attacker and, if transferability
can be established, the attacks are then applied to other systems.
to other systems. Our results show that it is impossible to perform transferability attacks based on
optimization attacks efficiently.
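
To illustrate what such an evaluation measures, the following Python sketch computes a targeted transfer success rate between two ASR systems. The callables surrogate_asr and target_asr are hypothetical stand-ins for whatever transcription interfaces the systems expose; they are illustrative assumptions, not part of this work.

    def transfer_success_rate(adv_examples, target_phrases, surrogate_asr, target_asr):
        """Targeted transferability: among adversarial audio examples that fool the
        surrogate (known) system, return the fraction that also force the unseen
        target system to output the attacker's phrase."""
        fooled_surrogate = 0
        transferred = 0
        for audio, phrase in zip(adv_examples, target_phrases):
            # The example must first succeed against the system it was optimized on.
            if surrogate_asr(audio) != phrase:
                continue
            fooled_surrogate += 1
            # The attack transfers if the unknown system is deceived as well.
            if target_asr(audio) == phrase:
                transferred += 1
        return transferred / fooled_surrogate if fooled_surrogate else 0.0

A transfer success rate close to zero for examples produced by optimization attacks corresponds to the result summarized above.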