Stochastic trapping in a solvable model of on-line independent component analysis
Previous analytical studies of on-line Independent Component Analysis (ICA)
learning rules have focussed on asymptotic stability and efficiency. In
practice the transient stages of learning will often be more significant in
determining the success of an algorithm. This is demonstrated here with an
analysis of a Hebbian ICA algorithm which can find a small number of
non-Gaussian components given data composed of a linear mixture of independent
source signals. An idealised data model is considered in which the sources
comprise a number of non-Gaussian and Gaussian sources, and a solution to the
dynamics is obtained in the limit where the number of Gaussian sources is
infinite. Previous stability results are confirmed by expanding around optimal
fixed points, where a closed form solution to the learning dynamics is
obtained. However, stochastic effects are shown to stabilise otherwise unstable
sub-optimal fixed points. Conditions required to destabilise one such fixed
point are obtained for the case of a single non-Gaussian component, indicating
that the initial learning rate \eta required to successfully escape is very low
(\eta = O(N^{-2}) where N is the data dimension) resulting in very slow
learning typically requiring O(N^3) iterations. Simulations confirm that this
picture holds for a finite system.
Comment: 17 pages, 3 figures. To appear in Neural Computation
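The escape scaling above is easy to probe numerically. Below is a minimal sketch of a one-unit Hebbian ICA learner of the general kind analysed in the paper, assuming a single unit-variance Laplacian source mixed with Gaussian ones; the orthogonal mixing, the cubic nonlinearity, and the constant learning rate are illustrative assumptions rather than the paper's exact setup.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 50                     # data dimension (small, for illustration)
eta = 1.0 / N**2           # learning rate scaled as O(N^-2), per the abstract

# Assumed setup: one Laplacian source among N-1 Gaussian sources,
# mixed by a random orthogonal matrix.
A, _ = np.linalg.qr(rng.standard_normal((N, N)))
w = rng.standard_normal(N)
w /= np.linalg.norm(w)

def sample():
    s = rng.standard_normal(N)
    s[0] = rng.laplace(scale=1 / np.sqrt(2))  # unit-variance non-Gaussian source
    return A @ s

for t in range(N**3):      # O(N^3) iterations, the scale suggested for escape
    x = sample()
    y = w @ x
    w += eta * x * (y**3 - 3 * y)  # Hebbian update with an illustrative nonlinearity
    w /= np.linalg.norm(w)         # project back onto the unit sphere

print("overlap with the non-Gaussian direction:", abs(w @ A[:, 0]))
```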
Analysis of Natural Gradient Descent for Multilayer Neural Networks
Natural gradient descent is a principled method for adapting the parameters
of a statistical model on-line using an underlying Riemannian parameter space
to redefine the direction of steepest descent. The algorithm is examined via
methods of statistical physics which accurately characterize both transient and
asymptotic behavior. A solution of the learning dynamics is obtained for the
case of multilayer neural network training in the limit of large input
dimension. We find that natural gradient learning leads to optimal asymptotic
performance and outperforms gradient descent in the transient, significantly
shortening or even removing plateaus in the transient generalization
performance which typically hamper gradient descent training.
Comment: 14 pages including figures. To appear in Physical Review
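As a concrete illustration of the preconditioning involved, here is a minimal natural-gradient step in Python; the damping term and the explicit Fisher-matrix argument are assumptions for a self-contained sketch, not part of the paper's analysis.

```python
import numpy as np

def natural_gradient_step(theta, grad, fisher, eta=0.05, damping=1e-6):
    """One natural gradient update: the Euclidean gradient is preconditioned
    by the inverse Fisher information matrix, which acts as the Riemannian
    metric on parameter space."""
    F = fisher + damping * np.eye(len(theta))  # small damping keeps F invertible
    return theta - eta * np.linalg.solve(F, grad)
```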
BayesBinMix: an R Package for Model Based Clustering of Multivariate Binary Data
The BayesBinMix package offers a Bayesian framework for clustering binary
data with or without missing values by fitting mixtures of multivariate
Bernoulli distributions with an unknown number of components. It allows the
joint estimation of the number of clusters and model parameters using Markov
chain Monte Carlo sampling. Heated chains are run in parallel, accelerating
convergence to the target posterior distribution. Identifiability issues
are addressed by implementing label switching algorithms. The package is
demonstrated and benchmarked against the Expectation-Maximization algorithm
using a simulation study as well as a real dataset.
Comment: Accepted to the R Journal. The package is available on CRAN: https://CRAN.R-project.org/package=BayesBinMix
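To make the underlying model class concrete, the sketch below fits a fixed-K mixture of multivariate Bernoulli distributions by EM in Python. It illustrates only the likelihood BayesBinMix builds on; the package itself is fully Bayesian, treats the number of components as unknown, and uses parallel heated MCMC chains with label-switching correction rather than EM.

```python
import numpy as np

def bernoulli_mixture_em(X, K, n_iter=200, seed=0):
    """EM for a K-component mixture of multivariate Bernoullis on binary data X."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    pi = np.full(K, 1.0 / K)                 # mixing weights
    theta = rng.uniform(0.25, 0.75, (K, d))  # per-component success probabilities
    for _ in range(n_iter):
        # E-step: responsibilities, computed in log space for stability
        log_r = (np.log(pi)
                 + X @ np.log(theta).T
                 + (1 - X) @ np.log1p(-theta).T)
        log_r -= log_r.max(axis=1, keepdims=True)
        r = np.exp(log_r)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights and success probabilities
        nk = r.sum(axis=0)
        pi = nk / n
        theta = np.clip((r.T @ X) / nk[:, None], 1e-6, 1 - 1e-6)
    return pi, theta
```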
The Dynamics of a Genetic Algorithm for a Simple Learning Problem
A formalism for describing the dynamics of Genetic Algorithms (GAs) using
methods from statistical mechanics is applied to the problem of generalization
in a perceptron with binary weights. The dynamics are solved for the case where
a new batch of training patterns is presented to each population member each
generation, which considerably simplifies the calculation. The theory is shown
to agree closely with simulations of a real GA averaged over many runs,
accurately predicting the mean best solution found. For weak selection and
large problem size the difference equations describing the dynamics can be
expressed analytically and we find that the effects of noise due to the finite
size of each training batch can be removed by increasing the population size
appropriately. If this population resizing is used, one can deduce the most
computationally efficient size of training batch each generation. For
independent patterns this choice also gives the minimum total number of
training patterns used. Although using independent patterns is a very
inefficient use of training patterns in general, this work may also prove
useful for determining the optimum batch size in the case where patterns are
recycled.
Comment: 28 pages, 4 Postscript figures. LaTeX using IOP macros ioplppt and iopl12, which are included. To appear in Journal of Physics A. Also available at ftp://ftp.cs.man.ac.uk/pub/ai/jls/GAlearn.ps.gz and http://www.cs.man.ac.uk/~jl
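A generational GA of the kind analysed above can be sketched as follows; the Boltzmann selection strength, mutation rate, and sizes are illustrative assumptions, while the paper's key ingredient, a fresh batch of training patterns for every generation, is kept.

```python
import numpy as np

rng = np.random.default_rng(1)
N, P, B = 64, 50, 200                # problem size, population, batch size (illustrative)
teacher = rng.choice([-1, 1], N)     # binary teacher weights
pop = rng.choice([-1, 1], (P, N))    # population of binary-weight perceptrons

def fitness(pop, batch):
    """Fraction of batch patterns each member classifies like the teacher;
    the finite batch makes this a noisy estimate of generalization."""
    targets = np.sign(batch @ teacher)
    return (np.sign(batch @ pop.T) == targets[:, None]).mean(axis=0)

for gen in range(200):
    batch = rng.standard_normal((B, N))  # new training batch each generation
    f = fitness(pop, batch)
    probs = np.exp(2.0 * f)              # Boltzmann selection (weak for small exponent)
    probs /= probs.sum()
    parents = pop[rng.choice(P, P, p=probs)]
    flips = rng.random(parents.shape) < 0.01   # bitwise mutation
    pop = np.where(flips, -parents, parents)

print("best training fitness in final generation:", f.max())
```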
Bayesian estimation of Differential Transcript Usage from RNA-seq data
Next-generation sequencing allows the identification of genes consisting of
differentially expressed transcripts, where differential expression usually
refers to changes in the overall expression level. A specific type of
differential expression is differential transcript usage (DTU), which targets
changes in the relative within-gene expression of a transcript. The
contributions of this paper are to: (a) extend cjBitSeq, a previously
introduced Bayesian model originally designed to identify changes in overall
expression levels, to the DTU context, and (b) propose a Bayesian version of
DRIMSeq, a frequentist model for inferring DTU. cjBitSeq is a read-based model
and performs fully Bayesian inference by MCMC sampling on the space of latent
states of each transcript per gene. BayesDRIMSeq is a count-based model and
estimates the Bayes Factor of a DTU model against a null model using Laplace's
approximation.
The proposed models are benchmarked against the existing ones using a recent
independent simulation study as well as a real RNA-seq dataset. Our results
suggest that the Bayesian methods exhibit performance similar to DRIMSeq in
terms of precision/recall, but offer better calibration of the False Discovery Rate.
Comment: Revised version, accepted to Statistical Applications in Genetics and Molecular Biology
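Since BayesDRIMSeq rests on Laplace's approximation, the following generic sketch shows how such a log marginal likelihood, and hence the Bayes Factor, is approximated; the function names and inputs are hypothetical, not the paper's implementation.

```python
import numpy as np

def log_marginal_laplace(log_joint_at_mode, neg_hessian):
    """Laplace approximation to the log marginal likelihood:
    log p(y) ~ log p(y, theta_hat) + (d/2) log(2*pi) - (1/2) log|H|,
    with theta_hat the posterior mode and H the negative Hessian of the
    log joint density at the mode."""
    d = neg_hessian.shape[0]
    _, logdet = np.linalg.slogdet(neg_hessian)
    return log_joint_at_mode + 0.5 * d * np.log(2 * np.pi) - 0.5 * logdet

def log_bayes_factor(log_ml_dtu, log_ml_null):
    """Log Bayes Factor of the DTU model against the null model."""
    return log_ml_dtu - log_ml_null
```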