3 research outputs found
Latent Variable Modeling with Diversity-Inducing Mutual Angular Regularization
Latent Variable Models (LVMs) are a large family of machine learning models
providing a principled and effective way to extract underlying patterns,
structure and knowledge from observed data. Due to the dramatic growth of
volume and complexity of data, several new challenges have emerged and cannot
be effectively addressed by existing LVMs: (1) How to capture long-tail
patterns that carry crucial information when the popularity of patterns is
distributed in a power-law fashion? (2) How to reduce model complexity and
computational cost without compromising the modeling power of LVMs? (3) How to
improve the interpretability and reduce the redundancy of discovered patterns?
To addresses the three challenges discussed above, we develop a novel
regularization technique for LVMs, which controls the geometry of the latent
space during learning to enable the learned latent components of LVMs to be
diverse in the sense that they are favored to be mutually different from each
other, to accomplish long-tail coverage, low redundancy, and better
interpretability. We propose a mutual angular regularizer (MAR) to encourage
the components in LVMs to have larger mutual angles. The MAR is non-convex and
non-smooth, entailing great challenges for optimization. To cope with this
issue, we derive a smooth lower bound of the MAR and optimize the lower bound
instead. We show that the monotonicity of the lower bound is closely aligned
with the MAR to qualify the lower bound as a desirable surrogate of the MAR.
Using neural network (NN) as an instance, we analyze how the MAR affects the
generalization performance of NN. On two popular latent variable models ---
restricted Boltzmann machine and distance metric learning, we demonstrate that
MAR can effectively capture long-tail patterns, reduce model complexity without
sacrificing expressivity and improve interpretability
MAE: Mutual Posterior-Divergence Regularization for Variational AutoEncoders
Variational Autoencoder (VAE), a simple and effective deep generative model,
has led to a number of impressive empirical successes and spawned many advanced
variants and theoretical investigations. However, recent studies demonstrate
that, when equipped with expressive generative distributions (aka. decoders),
VAE suffers from learning uninformative latent representations with the
observation called KL Varnishing, in which case VAE collapses into an
unconditional generative model. In this work, we introduce mutual
posterior-divergence regularization, a novel regularization that is able to
control the geometry of the latent space to accomplish meaningful
representation learning, while achieving comparable or superior capability of
density estimation. Experiments on three image benchmark datasets demonstrate
that, when equipped with powerful decoders, our model performs well both on
density estimation and representation learning.Comment: Published at ICLR-2019. 12 pages contents + 4 pages appendix, 5
figure
Diversity in Machine Learning
Machine learning methods have achieved good performance and been widely
applied in various real-world applications. They can learn the model adaptively
and be better fit for special requirements of different tasks. Generally, a
good machine learning system is composed of plentiful training data, a good
model training process, and an accurate inference. Many factors can affect the
performance of the machine learning process, among which the diversity of the
machine learning process is an important one. The diversity can help each
procedure to guarantee a total good machine learning: diversity of the training
data ensures that the training data can provide more discriminative information
for the model, diversity of the learned model (diversity in parameters of each
model or diversity among different base models) makes each parameter/model
capture unique or complement information and the diversity in inference can
provide multiple choices each of which corresponds to a specific plausible
local optimal result. Even though the diversity plays an important role in
machine learning process, there is no systematical analysis of the
diversification in machine learning system. In this paper, we systematically
summarize the methods to make data diversification, model diversification, and
inference diversification in the machine learning process, respectively. In
addition, the typical applications where the diversity technology improved the
machine learning performance have been surveyed, including the remote sensing
imaging tasks, machine translation, camera relocalization, image segmentation,
object detection, topic modeling, and others. Finally, we discuss some
challenges of the diversity technology in machine learning and point out some
directions in future work.Comment: Accepted by IEEE Acces