Description
In recent years, artificial neural networks have become widely used across many
applications and industries, especially to solve computer vision
and auditory tasks. However, they are not
as reliable as their accuracy on controlled tests might lead us to believe.
It is possible to apply minimal perturbations to the input that lead to an
erroneous output of the network; such perturbations are called
adversarial attacks. More recently, distrust
from the general public and from some researchers has motivated the
development of explainable AI (xAI) methods, which aim to explain the
output of a neural network. In this thesis, we leverage xAI
methods to identify the source of an error generated
by an adversarial input. In particular, we build on the work proposed in
Defense-GAN to correct the adversarial image in the context of an image
classification task. Through the work presented in this thesis, we show
that in most cases the classifier's accuracy
improves when a defense technique is applied, as long as the input
contains adversarial noise. Furthermore, the decrease in accuracy is
minimal on unperturbed inputs. Nevertheless, the original
Defense-GAN still provides better protection
against adversarial attacks on average when compared to our method (xAI
Defense-GAN).
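
As background, the kind of minimal perturbation described above can be illustrated with a small sketch of the Fast Gradient Sign Method (FGSM), a standard adversarial attack. The model, epsilon value, and input shape below are illustrative assumptions and do not correspond to the setup used in the thesis.

```python
# Minimal FGSM-style adversarial perturbation sketch (illustrative only).
# The model, input size, and epsilon are assumptions, not the thesis setup.
import torch
import torch.nn as nn

def fgsm_perturb(model, x, y, epsilon=0.1):
    """Return x perturbed in the direction that increases the classifier's loss."""
    x = x.clone().detach().requires_grad_(True)
    loss = nn.functional.cross_entropy(model(x), y)
    loss.backward()
    # Small step along the sign of the input gradient: visually minor,
    # but often enough to flip the classifier's prediction.
    x_adv = x + epsilon * x.grad.sign()
    return x_adv.clamp(0.0, 1.0).detach()

# Toy usage with a hypothetical linear classifier on flattened 28x28 images.
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
x = torch.rand(1, 1, 28, 28)   # a single "image" with values in [0, 1]
y = torch.tensor([3])          # its (assumed) true label
x_adv = fgsm_perturb(model, x, y)
print((x_adv - x).abs().max())  # perturbation magnitude bounded by epsilon
```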