Localized Shortcut Removal

XAI4CV @ CVPR2023

Authors: Nicolas Mueller, Jochen Jacobs, Jennifer Williams, and Konstantin Böttinger
Year/month: 2023/5
Booktitle: XAI4CV @ CVPR2023
Fulltext: https://doi.org/10.48550/arXiv.2211.15510

Abstract

Machine learning is a data-driven field, and the quality of the underlying datasets plays a crucial role in learning success. However, high performance on held-out test data does not necessarily indicate that a model generalizes or learns anything meaningful. This is often due to the existence of machine learning shortcuts: features in the data that are predictive but unrelated to the problem at hand. To address this issue for datasets where the shortcuts are smaller and more localized than the true features, we propose a novel approach to detect and remove them. We use an adversarially trained lens to detect and eliminate highly predictive but semantically unconnected clues in images. In our experiments on both synthetic and real-world data, we show that our approach reliably identifies and neutralizes such shortcuts without degrading model performance on clean data. We believe that our approach can lead to more meaningful and generalizable machine learning models, especially in scenarios where the quality of the underlying datasets is crucial.
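
For intuition, below is a minimal PyTorch-style sketch of the adversarial-lens idea the abstract describes: a deliberately small image-to-image network (the lens) is trained to increase a classifier's loss while being penalized for how much it changes the input, so it tends to erase small, highly predictive regions (candidate shortcuts), and the classifier is then trained on the filtered images. All module names, loss terms, and hyperparameters here are illustrative assumptions, not the exact formulation from the paper.

import torch
import torch.nn as nn
import torch.nn.functional as F

class Lens(nn.Module):
    # Deliberately small image-to-image network: low capacity biases it
    # toward erasing compact, localized cues rather than whole objects.
    def __init__(self, channels=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, channels, 3, padding=1),
        )

    def forward(self, x):
        # Residual output keeps the lens close to the identity mapping.
        return torch.clamp(x + self.net(x), 0.0, 1.0)

def train_step(lens, classifier, opt_lens, opt_clf, x, y, lam=10.0):
    # Lens step: raise the classifier's loss (i.e. remove predictive cues)
    # while an L1 penalty keeps the overall change to the image small.
    opt_lens.zero_grad()
    x_lensed = lens(x)
    adv_loss = -F.cross_entropy(classifier(x_lensed), y)
    change_penalty = (x_lensed - x).abs().mean()
    (adv_loss + lam * change_penalty).backward()
    opt_lens.step()

    # Classifier step: train on lens-filtered images, so the model can no
    # longer rely on the cues the lens has erased.
    opt_clf.zero_grad()
    with torch.no_grad():
        x_filtered = lens(x)
    F.cross_entropy(classifier(x_filtered), y).backward()
    opt_clf.step()

The L1 change penalty and the lens's restricted capacity are what would bias such a setup toward removing small, localized cues while leaving the larger true features intact.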

Bibtex:

@inproceedings{mueller2023localized,
  author    = {Nicolas Mueller and Jochen Jacobs and Jennifer Williams and Konstantin Böttinger},
  title     = {Localized Shortcut Removal},
  year      = {2023},
  month     = {May},
  booktitle = {XAI4CV @ CVPR2023},
  url       = {https://doi.org/10.48550/arXiv.2211.15510},
}