Anomaly Detection for imbalanced datasets with Deep Generative Models
Many important data analysis applications present with severely imbalanced
datasets with respect to the target variable. A typical example is medical
image analysis, where positive samples are scarce, while performance is
commonly estimated against the correct detection of these positive examples. We
approach this challenge by formulating the problem as anomaly detection with
generative models. We train a generative model without supervision on the
`negative' (common) datapoints and use this model to estimate the likelihood of
unseen data. A successful model allows us to detect the `positive' case as low
likelihood datapoints.
In this position paper, we present the use of state-of-the-art deep
generative models (GANs and VAEs) for estimating the likelihood of the data.
Our results show that, on the one hand, both GANs and VAEs are able to separate
the `positive' and `negative' samples in the MNIST case. On the other hand, for
the NLST case, neither GANs nor VAEs were able to capture the complexity of the
data and discriminate anomalies at the level that this task requires. These
results show that even though there are a number of successes presented in the
literature for using generative models in similar applications, there remain
further challenges for broad successful implementation.
Comment: 15 pages, 13 figures, accepted by the Benelearn 2018 conference.
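The detection rule described above (train a generative model without supervision on the common class only, then flag low-likelihood unseen points as `positive') can be sketched as follows. A Gaussian density stands in for the GAN/VAE, and the quantile threshold is an illustrative choice, not the paper's procedure:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the deep generative model: a Gaussian density fitted to
# the 'negative' (common) training points. In the paper's setting a
# GAN or VAE trained without supervision plays this role; the Gaussian
# is only an assumption to keep the sketch self-contained.
negatives = rng.normal(loc=0.0, scale=1.0, size=(1000, 2))
mu = negatives.mean(axis=0)
cov = np.cov(negatives, rowvar=False)
cov_inv = np.linalg.inv(cov)
_, logdet = np.linalg.slogdet(cov)

def log_likelihood(x):
    """Log-density of each row of x under the fitted model (d = 2)."""
    d = x - mu
    mahalanobis_sq = (d @ cov_inv * d).sum(axis=1)
    return -0.5 * (mahalanobis_sq + logdet + 2 * np.log(2 * np.pi))

# Unseen data: in-distribution points plus far-away 'positive' cases.
normal_test = rng.normal(0.0, 1.0, size=(100, 2))
anomalies = rng.normal(8.0, 1.0, size=(10, 2))

# Flag low-likelihood points as 'positive' (anomalous), thresholding
# at a low quantile of the training scores (the threshold is a choice).
threshold = np.quantile(log_likelihood(negatives), 0.01)
flags = log_likelihood(anomalies) < threshold
```

Any density model with a tractable (approximate) likelihood can replace the Gaussian here; the abstract's point is that for complex data such as NLST images, GANs and VAEs struggle to estimate this density accurately enough.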
Quantifying and Learning Disentangled Representations with Limited Supervision
Learning low-dimensional representations that disentangle the underlying
factors of variation in data has been posited as an important step towards
interpretable machine learning with good generalization. To address the fact
that there is no consensus on what disentanglement entails, Higgins et al.
(2018) propose a formal definition for Linear Symmetry-Based Disentanglement,
or LSBD, arguing that underlying real-world transformations give exploitable
structure to data.
Although several works focus on learning LSBD representations, such methods
require supervision on the underlying transformations for the entire dataset,
and cannot deal with unlabeled data. Moreover, none of these works provide a
metric to quantify LSBD.
We propose a metric to quantify LSBD representations that is easy to compute
under certain well-defined assumptions. Furthermore, we present a method that
can leverage unlabeled data, such that LSBD representations can be learned with
limited supervision on transformations. Using our LSBD metric, we show
that limited supervision is indeed sufficient to learn LSBD representations.
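The core idea of LSBD, that a transformation of the data should act on the representation as a fixed linear map, can be illustrated in a toy setting. The identity encoder and planar rotations below are illustrative assumptions; the paper's metric and learning method are more general:

```python
import numpy as np

def rotate(x, t):
    """Planar rotation by angle t, acting on row vectors."""
    R = np.array([[np.cos(t), -np.sin(t)],
                  [np.sin(t),  np.cos(t)]])
    return x @ R.T

# Identity encoder: for planar rotations this is trivially LSBD,
# since the rotation itself is the linear map acting on the code.
encode = lambda x: x

x = np.array([[1.0, 0.0]])
t = 0.7

# Linear equivariance check: encoding the transformed data should
# equal applying the linear representation to the code. The maximum
# discrepancy is zero exactly when the representation is LSBD for
# this transformation.
deviation = np.abs(encode(rotate(x, t)) - rotate(encode(x), t)).max()
```

A learned encoder would generally not satisfy this exactly; quantifying the discrepancy for a family of transformations is the role of an LSBD metric.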
Anomaly detection for visual quality control of 3D-printed products
We present a method for detection of surface defects in images of 3D-printed products that enables automated visual quality control. The data characterising this problem is typically high-dimensional (high-resolution images), imbalanced (defects are relatively rare), and has few labelled examples. We approach these challenges by formulating the problem as probabilistic anomaly detection, where we use Variational Autoencoders (VAE) to estimate the probability density of non-faulty products. We train the VAE in an unsupervised manner on images of non-faulty products only. A successful model will then assign high likelihood to unseen images of non-faulty products, and lower likelihood to images displaying defects. We test this method on anomaly detection scenarios using the MNIST dataset, as well as on images of 3D-printed products. The demonstrated performance is related to the capability of the model to closely estimate the density distribution of the non-faulty (expected) data. For both datasets we present empirical results showing that the likelihood estimated with a convolutional VAE can separate the normal and anomalous data. Moreover, we show how the reconstruction capabilities of VAEs are highly informative for human observers towards localising potential anomalies, which can aid the quality control process.
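The last point, that reconstructions help human observers localise anomalies, can be illustrated with a per-pixel error map. The toy image, the injected defect, and the use of the clean image as a proxy for a VAE reconstruction are all illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical 16x16 grayscale image of a non-faulty surface, with a
# small bright 'defect' patch injected at rows/cols 2..4.
clean = rng.normal(0.5, 0.02, size=(16, 16))
faulty = clean.copy()
faulty[2:5, 2:5] += 0.6  # simulated surface defect

# A VAE trained only on non-faulty images tends to reconstruct the
# defect-free appearance; here the clean image serves as a proxy for
# that reconstruction (an assumption, not a trained model).
reconstruction = clean

# Per-pixel squared reconstruction error: high values localise the
# candidate anomaly for a human inspector.
error_map = (faulty - reconstruction) ** 2
peak = np.unravel_index(error_map.argmax(), error_map.shape)
```

Overlaying such an error map on the input image points the inspector directly at the suspect region, rather than reporting only a scalar anomaly score.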
SHREC 2021: Retrieval of cultural heritage objects
This paper presents the methods and results of the SHREC’21 track on a dataset of cultural heritage (CH) objects. We present a dataset of 938 scanned models that have varied geometry and artistic styles. For the competition, we propose two challenges: the retrieval-by-shape challenge and the retrieval-by-culture challenge. The former aims at evaluating the ability of retrieval methods to discriminate cultural heritage objects by overall shape. The latter focuses on assessing the effectiveness of retrieving objects from the same culture. Both challenges constitute a suitable scenario to evaluate modern shape retrieval methods in a CH domain. Ten groups participated in the challenges: thirty runs were submitted for the retrieval-by-shape task, and twenty-six runs were submitted for the retrieval-by-culture task. The results show a predominance of learning methods on image-based multi-view representations to characterize 3D objects. Nevertheless, the problem presented in our challenges is far from being solved. We also identify the potential paths for further improvements and give insights into the future directions of research.