Description
Deep learning has witnessed a remarkable evolution in classification networks since their
breakthrough in 2012. These networks, however, remain vulnerable to adversarial attacks.
In response, we introduce a novel approach based on synthetic adversarial examples. This
methodology aims to strengthen model robustness by simultaneously improving robust
accuracy and reducing the model's generalization error.
To produce these semantically similar adversarial examples, we leverage diffusion models,
which have attracted interest for their ability to capture data representations. Because their
outputs closely resemble the input data, they are well suited to generating synthetic data on a
per-class basis and to improving overall model generalization. In our approach, adversarial
examples generated with the widely used Fast Gradient Sign Method (FGSM), a white-box attack,
serve as input to the diffusion model, which then produces synthetic adversarial examples.
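As a rough illustration of the attack step, the sketch below shows a standard FGSM implementation in PyTorch. The classifier `model`, the perturbation budget `epsilon`, and the hand-off to a diffusion pipeline are assumptions for illustration, not the exact setup used in this work.

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, epsilon=8 / 255):
    """Craft FGSM adversarial examples: x_adv = x + epsilon * sign(grad_x loss)."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    # Perturb each pixel in the direction that increases the loss,
    # then clamp back to the valid image range.
    x_adv = x + epsilon * x.grad.sign()
    return x_adv.clamp(0.0, 1.0).detach()

# In the approach described above, x_adv would then be fed to a diffusion
# model (e.g., an image-to-image pipeline) to synthesize semantically similar
# adversarial training data; the diffusion interface here is an assumption,
# as the abstract does not specify it.
```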
We compared our approach with traditional adversarial training and observed an improvement
in robust accuracy: an increase of 10.95 percentage points on benign test samples and of
10.81 percentage points on adversarial test samples.