2,728 research outputs found
Improving Unsupervised Defect Segmentation by Applying Structural Similarity to Autoencoders
Convolutional autoencoders have emerged as popular methods for unsupervised
defect segmentation on image data. Most commonly, this task is performed by
thresholding a pixel-wise reconstruction error based on an distance.
This procedure, however, leads to large residuals whenever the reconstruction
encompasses slight localization inaccuracies around edges. It also fails to
reveal defective regions that have been visually altered when intensity values
stay roughly consistent. We show that these problems prevent these approaches
from being applied to complex real-world scenarios and that it cannot be easily
avoided by employing more elaborate architectures such as variational or
feature matching autoencoders. We propose to use a perceptual loss function
based on structural similarity which examines inter-dependencies between local
image regions, taking into account luminance, contrast and structural
information, instead of simply comparing single pixel values. It achieves
significant performance gains on a challenging real-world dataset of
nanofibrous materials and a novel dataset of two woven fabrics over the state
of the art approaches for unsupervised defect segmentation that use pixel-wise
reconstruction error metrics
Variance Loss in Variational Autoencoders
In this article, we highlight what appears to be major issue of Variational
Autoencoders, evinced from an extensive experimentation with different network
architectures and datasets: the variance of generated data is significantly
lower than that of training data. Since generative models are usually evaluated
with metrics such as the Frechet Inception Distance (FID) that compare the
distributions of (features of) real versus generated images, the variance loss
typically results in degraded scores. This problem is particularly relevant in
a two stage setting, where we use a second VAE to sample in the latent space of
the first VAE. The minor variance creates a mismatch between the actual
distribution of latent variables and those generated by the second VAE, that
hinders the beneficial effects of the second stage. Renormalizing the output of
the second VAE towards the expected normal spherical distribution, we obtain a
sudden burst in the quality of generated samples, as also testified in terms of
FID.Comment: Article accepted at the Sixth International Conference on Machine
Learning, Optimization, and Data Science. July 19-23, 2020 - Certosa di
Pontignano, Siena, Ital
- …