ADADELTA: An Adaptive Learning Rate Method
We present a novel per-dimension learning rate method for gradient descent
called ADADELTA. The method dynamically adapts over time using only first order
information and has minimal computational overhead beyond vanilla stochastic
gradient descent. The method requires no manual tuning of a learning rate and
appears robust to noisy gradient information, different model architecture
choices, various data modalities and selection of hyperparameters. We show
promising results compared to other methods on the MNIST digit classification
task using a single machine and on a large scale voice dataset in a distributed
cluster environment. Comment: 6 pages
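The abstract above does not spell out the update rule, so the following is only a minimal sketch of the per-dimension ADADELTA update as it is commonly stated: a decaying average of squared gradients, a decaying average of squared updates, and a step equal to the ratio of their root mean squares times the gradient. The function names, the toy quadratic objective, and the defaults `rho=0.95` and `eps=1e-6` are assumptions chosen for illustration, not taken from the listing.

```python
import math


def adadelta_step(x, grad, avg_sq_grad, avg_sq_update, rho=0.95, eps=1e-6):
    """Apply one ADADELTA update in place and return the new parameters.

    x, grad: lists of parameters and their gradients.
    avg_sq_grad, avg_sq_update: running averages, one entry per dimension.
    """
    new_x = []
    for i, (xi, gi) in enumerate(zip(x, grad)):
        # Decaying average of squared gradients for this dimension.
        avg_sq_grad[i] = rho * avg_sq_grad[i] + (1 - rho) * gi * gi
        # Step: RMS of past updates over RMS of gradients, times the gradient.
        dx = -math.sqrt(avg_sq_update[i] + eps) / math.sqrt(avg_sq_grad[i] + eps) * gi
        # Decaying average of squared updates for this dimension.
        avg_sq_update[i] = rho * avg_sq_update[i] + (1 - rho) * dx * dx
        new_x.append(xi + dx)
    return new_x


def toy_loss(x, coeffs=(1.0, 10.0)):
    """Hypothetical badly scaled quadratic: f(x) = sum(c_i * x_i^2)."""
    return sum(c * xi * xi for c, xi in zip(coeffs, x))


def run_adadelta(steps, coeffs=(1.0, 10.0), start=(3.0, -2.0)):
    """Run ADADELTA on the toy quadratic; note that no learning rate is passed."""
    x = list(start)
    avg_sq_grad = [0.0] * len(x)
    avg_sq_update = [0.0] * len(x)
    for _ in range(steps):
        grad = [2 * c * xi for c, xi in zip(coeffs, x)]
        x = adadelta_step(x, grad, avg_sq_grad, avg_sq_update)
    return x
```

In this sketch the two coordinates have gradients of very different magnitude (6 and -40 at the start), yet the first step has nearly the same size in both dimensions, which is the per-dimension normalization the abstract refers to.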
Unsupervised Generative Modeling Using Matrix Product States
Generative modeling, which learns a joint probability distribution from data
and generates samples according to it, is an important task in machine learning
and artificial intelligence. Inspired by the probabilistic interpretation of
quantum physics, we propose a generative model using matrix product states,
which is a tensor network originally proposed for describing (particularly
one-dimensional) entangled quantum states. Our model enjoys efficient learning
analogous to the density matrix renormalization group method, which allows
dynamically adjusting dimensions of the tensors and offers an efficient direct
sampling approach for generative tasks. We apply our method to generative
modeling of several standard datasets including the Bars and Stripes, random
binary patterns and the MNIST handwritten digits to illustrate the abilities,
features and drawbacks of our model compared with popular generative models such as
the Hopfield model, Boltzmann machines and generative adversarial networks. Our
work sheds light on many interesting directions for future exploration in the
development of quantum-inspired algorithms for unsupervised machine learning,
which could potentially be realized on quantum devices. Comment: 11 pages, 12 figures (not including the TNs). GitHub Page:
https://congzlwag.github.io/UnsupGenModbyMPS
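As a rough illustration of the probabilistic interpretation mentioned in the abstract above, the sketch below treats a small matrix product state as a wavefunction over bit strings and assigns each string a probability proportional to its squared amplitude. The tensor layout, the function names, and the brute-force normalization over all 2^n strings are assumptions made for brevity; they are only practical for a handful of sites and are not the efficient learning or sampling scheme the abstract describes.

```python
import itertools
import random


def random_mps(n_sites, bond_dim, seed=0):
    """Build a random real MPS with open boundaries (hypothetical layout).

    tensors[site][bit] is a matrix with d_left rows and d_right columns.
    """
    rng = random.Random(seed)
    tensors = []
    for site in range(n_sites):
        d_left = 1 if site == 0 else bond_dim
        d_right = 1 if site == n_sites - 1 else bond_dim
        tensors.append([
            [[rng.uniform(-1.0, 1.0) for _ in range(d_right)] for _ in range(d_left)]
            for _ in range(2)
        ])
    return tensors


def amplitude(tensors, bits):
    """Contract the MPS left to right for one bit string."""
    vec = [1.0]  # left boundary vector, bond dimension 1
    for tensor, bit in zip(tensors, bits):
        mat = tensor[bit]
        vec = [
            sum(vec[i] * mat[i][j] for i in range(len(vec)))
            for j in range(len(mat[0]))
        ]
    return vec[0]  # right boundary has bond dimension 1


def born_distribution(tensors):
    """Return {bit string: probability} with p proportional to amplitude squared."""
    configs = list(itertools.product((0, 1), repeat=len(tensors)))
    weights = [amplitude(tensors, c) ** 2 for c in configs]
    z = sum(weights)  # brute-force normalization, exponential in the number of sites
    return {c: w / z for c, w in zip(configs, weights)}
```

With bond dimension 1 and every entry set to 1.0, every bit string has the same amplitude, so this sketch returns the uniform distribution; larger bond dimensions allow correlations between sites.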
Modal Learning Neural Networks
This paper explores the integration of learning modes into a single neural network structure in which layers of neurons or individual neurons adopt different modes. There are several reasons to explore modal learning. One motivation is to overcome the inherent limitations of any given mode (for example, some modes memorise specific features, others average across features, and both approaches may be relevant according to the circumstances); another is inspiration from neuroscience, cognitive science and human learning, where it is impossible to build a serious model without consideration of multiple modes; and a third is non-stationary input data, or time-variant learning objectives, where the required mode is a function of time. Two modal learning ideas are presented: the Snap-Drift Neural Network (SDNN), which toggles its learning between two modes and is incorporated into an on-line system to provide carefully targeted guidance and feedback to students; and an adaptive function neural network (ADFUNN), in which adaptation applies simultaneously to both the weights and the individual neuron activation functions. The combination of the two modal learning methods, in the form of the Snap-drift ADaptive FUnction Neural Network (SADFUNN), is then applied to optical and pen-based recognition of handwritten digits, with results that demonstrate the effectiveness of the approach.
VIGAN: Missing View Imputation with Generative Adversarial Networks
In an era when big data are becoming the norm, there is less concern with the
quantity but more with the quality and completeness of the data. In many
disciplines, data are collected from heterogeneous sources, resulting in
multi-view or multi-modal datasets. The missing data problem has been
challenging to address in multi-view data analysis. In particular, when certain
samples lack an entire view of data, the missing view problem arises.
Classic multiple imputation or matrix completion methods are hardly effective
here because the specific view contains no information on which to base imputation
for such samples. The commonly-used simple method of removing samples with a
missing view can dramatically reduce sample size, thus diminishing the
statistical power of a subsequent analysis. In this paper, we propose a novel
approach for view imputation via generative adversarial networks (GANs), which
we name VIGAN. This approach first treats each view as a separate domain and
identifies domain-to-domain mappings via a GAN using randomly-sampled data from
each view, and then employs a multi-modal denoising autoencoder (DAE) to
reconstruct the missing view from the GAN outputs based on paired data across
the views. Then, by optimizing the GAN and DAE jointly, our model enables the
knowledge integration for domain mappings and view correspondences to
effectively recover the missing view. Empirical results on benchmark datasets
validate the VIGAN approach by comparing against the state of the art. The
evaluation of VIGAN in a genetic study of substance use disorders further
proves the effectiveness and usability of this approach in life science. Comment: 10 pages, 8 figures, conferenc