Feature visualization is a technique for visualizing the features learned by black-box machine learning models. Our approach explores an altered training process to improve the interpretability of these visualizations. We argue that using background removal as a form of robust training forces a network to learn more human-recognizable features, namely by focusing on the main object of interest without distraction from the background. Four training regimes were used to test this hypothesis: the first used unmodified images; the second replaced the background with black; the third replaced it with Gaussian noise; the fourth mixed background-removed and unmodified images. The feature visualization results show that models trained on background-removed images yield a significant improvement over the baseline: their visualizations display easily recognizable features of the respective classes, unlike those of the model trained on unmodified data.
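The four training regimes above amount to a per-image preprocessing step. Below is a minimal sketch of that step, assuming a precomputed binary foreground mask; the function name, the mask source, and the noise parameters are illustrative assumptions, not the paper's actual pipeline.

```python
import numpy as np

def replace_background(image, mask, mode, rng=None):
    """Apply one of the four background regimes to a single image.

    image: float array (H, W, C) with values in [0, 1].
    mask:  bool array (H, W), True where the main object is.
    mode:  'original' - unmodified picture
           'black'    - background pixels set to zero
           'noise'    - background filled with Gaussian noise
           'mixed'    - coin flip between 'original' and 'black'
    """
    rng = rng or np.random.default_rng(0)
    if mode == 'mixed':
        # Randomly choose per image, mimicking a mixed training set.
        mode = 'original' if rng.random() < 0.5 else 'black'
    if mode == 'original':
        return image.copy()
    out = image.copy()
    bg = ~mask  # boolean mask broadcasts over the channel axis
    if mode == 'black':
        out[bg] = 0.0
    elif mode == 'noise':
        # Illustrative noise parameters; clipped back into [0, 1].
        noise = rng.normal(0.5, 0.15, size=image.shape)
        out[bg] = np.clip(noise, 0.0, 1.0)[bg]
    else:
        raise ValueError(f"unknown mode: {mode}")
    return out
```

In a training loop, this step would run before normalization and augmentation, so the network only ever sees the chosen background variant for each image.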