Sliced Wasserstein Distance for Learning Gaussian Mixture Models
Gaussian mixture models (GMM) are powerful parametric tools with many
applications in machine learning and computer vision. Expectation maximization
(EM) is the most popular algorithm for estimating the GMM parameters. However,
EM guarantees only convergence to a stationary point of the log-likelihood
function, which could be arbitrarily worse than the optimal solution. Inspired
by the relationship between the negative log-likelihood function and the
Kullback-Leibler (KL) divergence, we propose an alternative formulation for
estimating the GMM parameters using the sliced Wasserstein distance, which
gives rise to a new algorithm. Specifically, we propose minimizing the
sliced-Wasserstein distance between the mixture model and the data distribution
with respect to the GMM parameters. In contrast to the KL-divergence, the
energy landscape for the sliced-Wasserstein distance is more well-behaved and
therefore more suitable for a stochastic gradient descent scheme to obtain the
optimal GMM parameters. We show that our formulation results in parameter
estimates that are more robust to random initializations and demonstrate that
it can estimate high-dimensional data distributions more faithfully than the EM
algorithm.
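The core computation in this formulation, estimating the sliced Wasserstein distance between two sets of samples by sorting random one-dimensional projections, can be sketched as follows (an illustrative NumPy sketch, not the authors' implementation; the function name and interface are assumptions):

```python
import numpy as np

def sliced_wasserstein(x, y, n_projections=100, rng=None):
    """Monte Carlo estimate of the sliced 2-Wasserstein distance between
    two equally sized empirical samples x, y of shape (n, d).
    Illustrative sketch only."""
    rng = np.random.default_rng(rng)
    d = x.shape[1]
    # Draw random unit directions on the sphere
    theta = rng.normal(size=(n_projections, d))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)
    # Project onto each direction and sort: for uniform empirical measures,
    # the 1D 2-Wasserstein distance is the L2 distance of sorted projections
    px = np.sort(x @ theta.T, axis=0)
    py = np.sort(y @ theta.T, axis=0)
    return np.sqrt(np.mean((px - py) ** 2))
```

In the scheme the abstract describes, one would draw samples from the current mixture model, evaluate this distance against samples from the data distribution, and update the GMM parameters by stochastic gradient descent.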
Adversarial Example Detection and Classification With Asymmetrical Adversarial Training
The vulnerabilities of deep neural networks against adversarial examples have
become a significant concern for deploying these models in sensitive domains.
Devising a definitive defense against such attacks has proven to be challenging,
and methods that rely on detecting adversarial samples are valid only when
the attacker is oblivious to the detection mechanism. In this paper we first
present an adversarial example detection method that provides a performance
guarantee against norm-constrained adversaries. The method is based on the idea
of training adversarially robust subspace detectors using asymmetrical adversarial
training (AAT). The novel AAT objective presents a minimax problem similar to
that of GANs; it has the same convergence property, and consequently supports
the learning of class conditional distributions. We first demonstrate that the
minimax problem can be reasonably solved by a PGD attack, and then use the
learned class conditional generative models to define generative
detection/classification models that are both robust and more interpretable. We
provide comprehensive evaluations of the above methods, and demonstrate their
competitive performance and compelling properties on adversarial detection and
robust classification problems. Comment: ICLR 202
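The PGD step mentioned above, projected gradient ascent under a norm constraint, can be illustrated on a generic differentiable loss (a minimal sketch under an ℓ∞ constraint; `loss_grad` and the parameter names are hypothetical, and the paper applies PGD to trained detector losses rather than a toy gradient):

```python
import numpy as np

def pgd_attack(loss_grad, x0, eps=0.3, step=0.05, n_steps=40):
    """Projected gradient ascent within an l-infinity ball of radius eps
    around x0. Illustrative sketch of the generic PGD procedure."""
    x = x0.copy()
    for _ in range(n_steps):
        g = loss_grad(x)
        x = x + step * np.sign(g)           # signed-gradient ascent step
        x = np.clip(x, x0 - eps, x0 + eps)  # project back into the ball
    return x
```

Each iteration takes a signed-gradient step to increase the loss, then projects the iterate back into the allowed perturbation ball around the clean input.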
Analyzing and Improving Generative Adversarial Training for Generative Modeling and Out-of-Distribution Detection
Generative adversarial training (GAT) is a recently introduced adversarial
defense method. Previous works have focused on empirical evaluations of its
application to training robust predictive models. In this paper we focus on
theoretical understanding of the GAT method and extending its application to
generative modeling and out-of-distribution detection. We analyze the optimal
solutions of the maximin formulation employed by the GAT objective, and compare
them with those of the minimax formulation employed by GANs. We use
theoretical analysis and 2D simulations to understand the convergence property
of the training algorithm. Based on these results, we develop an incremental
generative training algorithm, and conduct comprehensive evaluations of the
algorithm's application to image generation and adversarial out-of-distribution
detection. Our results suggest that generative adversarial training is a
promising new direction for the above applications.
A General System for Automatic Biomedical Image Segmentation Using Intensity Neighborhoods
Image segmentation is important, with applications to several problems in biology and medicine. Although segmentation has been extensively researched, current methods generally perform adequately in the applications for which they were designed, but often require extensive modification or recalibration before being used in a different application. We describe an approach that, with few modifications, can be used in a variety of image segmentation problems. The approach is based on a supervised learning strategy that utilizes intensity neighborhoods to assign each pixel in a test image its correct class based on training data. We describe methods for modeling rotations and variations in scale, as well as a subset-selection procedure for training the classifiers. We show that the performance of our approach on tissue segmentation tasks in magnetic resonance and histopathology microscopy images, as well as on nuclei segmentation from fluorescence microscopy images, is similar to or better than that of several algorithms specifically designed for each of these applications.
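The intensity-neighborhood idea, classifying each pixel from the raw intensities of a small window around it, can be sketched with a nearest-neighbor classifier (illustrative only; the helper names are hypothetical, and the paper's pipeline additionally models rotations, scales, and training-subset selection):

```python
import numpy as np

def neighborhood_features(img, r=1):
    """Extract the (2r+1)x(2r+1) intensity neighborhood around each
    interior pixel as a flat feature vector. Hypothetical helper."""
    h, w = img.shape
    feats = []
    for i in range(r, h - r):
        for j in range(r, w - r):
            feats.append(img[i - r:i + r + 1, j - r:j + r + 1].ravel())
    return np.array(feats)

def classify_pixels(train_feats, train_labels, test_feats):
    """Assign each test pixel the label of its nearest training
    neighborhood (1-NN on squared Euclidean distance)."""
    d = ((test_feats[:, None, :] - train_feats[None, :, :]) ** 2).sum(-1)
    return train_labels[d.argmin(axis=1)]
```

Training pairs each pixel's neighborhood with its ground-truth class; at test time, each pixel is labeled by its most similar training neighborhood.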
Generalized Sliced Wasserstein Distances
The Wasserstein distance and its variations, e.g., the sliced-Wasserstein
(SW) distance, have recently drawn attention from the machine learning
community. The SW distance, specifically, was shown to have similar properties
to the Wasserstein distance, while being much simpler to compute, and is
therefore used in various applications including generative modeling and
general supervised/unsupervised learning. In this paper, we first clarify the
mathematical connection between the SW distance and the Radon transform. We
then utilize the generalized Radon transform to define a new family of
distances for probability measures, which we call generalized
sliced-Wasserstein (GSW) distances. We also show that, similar to the SW
distance, the GSW distance can be extended to a maximum GSW (max-GSW) distance.
We then provide the conditions under which GSW and max-GSW distances are indeed
distances. Finally, we compare the numerical performance of the proposed
distances on several generative modeling tasks, including SW flows and SW
auto-encoders.
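The generalization described above, replacing the linear projections of the Radon transform with a nonlinear defining function, can be sketched as follows (an illustrative Monte Carlo sketch; the odd cubic polynomial is just one admissible choice of defining function, and all names are assumptions):

```python
import numpy as np

def gsw_distance(x, y, proj, n_projections=50, rng=None):
    """Generalized sliced Wasserstein estimate between equally sized
    samples x, y: the linear slice x -> <theta, x> is replaced by a
    defining function proj(x, theta). Illustrative sketch only."""
    rng = np.random.default_rng(rng)
    d = x.shape[1]
    theta = rng.normal(size=(n_projections, d))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)
    # Sort the generalized projections; each column is a 1D slice
    px = np.sort(proj(x, theta), axis=0)
    py = np.sort(proj(y, theta), axis=0)
    return np.sqrt(np.mean((px - py) ** 2))

# The linear defining function recovers the ordinary SW distance;
# an odd cubic of the inner product is one example of a nonlinear slice.
linear = lambda x, theta: x @ theta.T
cubic = lambda x, theta: (x @ theta.T) ** 3
```

Swapping `proj` changes the family of slices while the rest of the computation, sorting one-dimensional projections, stays the same.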