Multi-Modal Mean-Fields via Cardinality-Based Clamping
Mean Field inference is central to statistical physics. It has attracted much
interest in the Computer Vision community to efficiently solve problems
expressible in terms of large Conditional Random Fields. However, since it
models the posterior probability distribution as a product of marginal
probabilities, it may fail to properly account for important dependencies
between variables. We therefore replace the fully factorized distribution of
Mean Field by a weighted mixture of such distributions that similarly
minimizes the KL-Divergence to the true posterior. By introducing two new
ideas, namely, conditioning on groups of variables instead of single ones and
using a parameter of the conditional random field potentials, which we identify
with the temperature in the sense of statistical physics, to select such groups,
we can perform this minimization efficiently. Our extension of the clamping
method proposed in previous works allows us to both produce a more descriptive
approximation of the true posterior and, inspired by the diverse MAP paradigms,
fit a mixture of Mean Field approximations. We demonstrate that this positively
impacts real-world algorithms that initially relied on mean fields.
Comment: Submitted for review to CVPR 201
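The abstract above does not give formulas, so the following is only a notational sketch of what "a weighted mixture of fully factorized distributions" could look like. The symbols (mixture weights m_k, per-variable marginals q_i^k, number of modes K, observation I) are assumptions introduced here for illustration, not the paper's own notation.

```latex
% Standard Mean Field: one fully factorized distribution
Q_{\mathrm{MF}}(x) = \prod_{i} q_i(x_i)

% Multi-modal variant (sketch): a weighted mixture of K such products
Q_{\mathrm{MM}}(x) = \sum_{k=1}^{K} m_k \prod_{i} q_i^{k}(x_i),
\qquad m_k \ge 0, \quad \sum_{k=1}^{K} m_k = 1

% Both are fit by minimizing the KL-Divergence to the true posterior P(x \mid I)
\mathrm{KL}\left(Q \,\|\, P\right) = \sum_{x} Q(x) \log \frac{Q(x)}{P(x \mid I)}
```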
Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference
The rising popularity of intelligent mobile devices and the daunting
computational cost of deep learning-based models call for efficient and
accurate on-device inference schemes. We propose a quantization scheme that
allows inference to be carried out using integer-only arithmetic, which can be
implemented more efficiently than floating point inference on commonly
available integer-only hardware. We also co-design a training procedure to
preserve end-to-end model accuracy post quantization. As a result, the proposed
quantization scheme improves the tradeoff between accuracy and on-device
latency. The improvements are significant even on MobileNets, a model family
known for run-time efficiency, and are demonstrated in ImageNet classification
and COCO detection on popular CPUs.
Comment: 14 pages, 12 figures
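The abstract above does not spell out the quantization scheme. As a minimal sketch, assume an affine mapping from real values to 8-bit integers, r ≈ scale * (q - zero_point). The function names `choose_params`, `quantize`, `dequantize`, and `int_dot` are hypothetical and chosen here for illustration only.

```python
def choose_params(rmin, rmax, qmin=0, qmax=255):
    """Pick (scale, zero_point) for an assumed affine scheme r ~= scale * (q - zero_point)."""
    # Widen the range to include 0.0 so that real zero maps to an exact integer.
    rmin, rmax = min(rmin, 0.0), max(rmax, 0.0)
    scale = (rmax - rmin) / (qmax - qmin)
    if scale == 0.0:
        scale = 1.0  # degenerate range: any positive scale works
    zero_point = int(round(qmin - rmin / scale))
    zero_point = max(qmin, min(qmax, zero_point))
    return scale, zero_point


def quantize(r, scale, zero_point, qmin=0, qmax=255):
    """Map a real value to a clamped integer code."""
    q = int(round(r / scale)) + zero_point
    return max(qmin, min(qmax, q))


def dequantize(q, scale, zero_point):
    """Map an integer code back to an approximate real value."""
    return scale * (q - zero_point)


def int_dot(qa, za, qb, zb):
    """Dot product accumulated purely in integers; the caller rescales once at the end."""
    return sum((a - za) * (b - zb) for a, b in zip(qa, qb))


# Usage: quantize two small vectors, multiply in integers, rescale at the end.
scale, zp = choose_params(-1.0, 1.0)
a, b = [0.5, -0.25], [0.1, 0.9]
qa = [quantize(v, scale, zp) for v in a]
qb = [quantize(v, scale, zp) for v in b]
acc = int_dot(qa, zp, qb, zp)      # integer accumulator
approx = scale * scale * acc       # single floating-point rescale
```

The inner loop touches only integers; the floating-point scales appear once, outside the accumulation. In a fully integer pipeline that final rescale would itself be replaced by a fixed-point multiply, which this sketch leaves out.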
InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets
This paper describes InfoGAN, an information-theoretic extension to the
Generative Adversarial Network that is able to learn disentangled
representations in a completely unsupervised manner. InfoGAN is a generative
adversarial network that also maximizes the mutual information between a small
subset of the latent variables and the observation. We derive a lower bound to
the mutual information objective that can be optimized efficiently, and show
that our training procedure can be interpreted as a variation of the Wake-Sleep
algorithm. Specifically, InfoGAN successfully disentangles writing styles from
digit shapes on the MNIST dataset, pose from lighting of 3D rendered images,
and background digits from the central digit on the SVHN dataset. It also
discovers visual concepts that include hair styles, presence/absence of
eyeglasses, and emotions on the CelebA face dataset. Experiments show that
InfoGAN learns interpretable representations that are competitive with
representations learned by existing fully supervised methods.
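The abstract above mentions a lower bound on the mutual information but does not state it. The form usually quoted for InfoGAN is sketched below; the symbols (latent code c, noise z, generator G, discriminator D, auxiliary distribution Q, weight λ) are not defined in the abstract and are assumed here.

```latex
% Variational lower bound on the mutual information between code c and sample G(z, c)
I\bigl(c;\, G(z, c)\bigr) \;\ge\;
\mathbb{E}_{c \sim P(c),\; x \sim G(z, c)}\bigl[\log Q(c \mid x)\bigr] + H(c)
\;=\; L_I(G, Q)

% Regularized minimax game, with V(D, G) the usual GAN objective
\min_{G,\, Q} \; \max_{D} \; V(D, G) \;-\; \lambda\, L_I(G, Q)
```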
Quantum-Assisted Learning of Hardware-Embedded Probabilistic Graphical Models
Mainstream machine-learning techniques such as deep learning and
probabilistic programming rely heavily on sampling from generally intractable
probability distributions. There is increasing interest in the potential
advantages of using quantum computing technologies as sampling engines to speed
up these tasks or to make them more effective. However, some pressing
challenges in state-of-the-art quantum annealers have to be overcome before we
can assess their actual performance. The sparse connectivity, resulting from
the local interaction between quantum bits in physical hardware
implementations, is considered the most severe limitation to the quality of
constructing powerful generative unsupervised machine-learning models. Here we
use embedding techniques to add redundancy to data sets, allowing us to
increase the modeling capacity of quantum annealers. We illustrate our findings
by training hardware-embedded graphical models on a binarized data set of
handwritten digits and two synthetic data sets in experiments with up to 940
quantum bits. Our model can be trained in quantum hardware without full
knowledge of the effective parameters specifying the corresponding quantum
Gibbs-like distribution; therefore, this approach avoids the need to infer the
effective temperature at each iteration, speeding up learning; it also
mitigates the effect of noise in the control parameters, making it robust to
deviations from the reference Gibbs distribution. Our approach demonstrates the
feasibility of using quantum annealers for implementing generative models, and
it provides a suitable framework for benchmarking these quantum technologies on
machine-learning-related tasks.
Comment: 17 pages, 8 figures. Minor further revisions. As published in Phys. Rev.