7,860 research outputs found
Non-strongly-convex smooth stochastic approximation with convergence rate O(1/n)
We consider the stochastic approximation problem where a convex function has
to be minimized, given only the knowledge of unbiased estimates of its
gradients at certain points, a framework which includes machine learning
methods based on the minimization of the empirical risk. We focus on problems
without strong convexity, for which all previously known algorithms achieve a
convergence rate for function values of O(1/n^{1/2}). We consider and analyze
two algorithms that achieve a rate of O(1/n) for classical supervised learning
problems. For least-squares regression, we show that averaged stochastic
gradient descent with constant step-size achieves the desired rate. For
logistic regression, this is achieved by a simple novel stochastic gradient
algorithm that (a) constructs successive local quadratic approximations of the
loss functions, while (b) preserving the same running time complexity as
stochastic gradient descent. For these algorithms, we provide a non-asymptotic
analysis of the generalization error (in expectation, and also in high
probability for least-squares), and run extensive experiments on standard
machine learning benchmarks showing that they often outperform existing
approaches
Unconventional machine learning of genome-wide human cancer data
Recent advances in high-throughput genomic technologies coupled with
exponential increases in computer processing and memory have allowed us to
interrogate the complex aberrant molecular underpinnings of human disease from
a genome-wide perspective. While the deluge of genomic information is expected
to increase, a bottleneck in conventional high-performance computing is rapidly
approaching. Inspired in part by recent advances in physical quantum
processors, we evaluated several unconventional machine learning (ML)
strategies on actual human tumor data. Here we show for the first time the
efficacy of multiple annealing-based ML algorithms for classification of
high-dimensional, multi-omics human cancer data from the Cancer Genome Atlas.
To assess algorithm performance, we compared these classifiers to a variety of
standard ML methods. Our results indicate the feasibility of using
annealing-based ML to provide competitive classification of human cancer types
and associated molecular subtypes and superior performance with smaller
training datasets, thus providing compelling empirical evidence for the
potential future application of unconventional computing architectures in the
biomedical sciences
Semistochastic Quadratic Bound Methods
Partition functions arise in a variety of settings, including conditional
random fields, logistic regression, and latent gaussian models. In this paper,
we consider semistochastic quadratic bound (SQB) methods for maximum likelihood
inference based on partition function optimization. Batch methods based on the
quadratic bound were recently proposed for this class of problems, and
performed favorably in comparison to state-of-the-art techniques.
Semistochastic methods fall in between batch algorithms, which use all the
data, and stochastic gradient type methods, which use small random selections
at each iteration. We build semistochastic quadratic bound-based methods, and
prove both global convergence (to a stationary point) under very weak
assumptions, and linear convergence rate under stronger assumptions on the
objective. To make the proposed methods faster and more stable, we consider
inexact subproblem minimization and batch-size selection schemes. The efficacy
of SQB methods is demonstrated via comparison with several state-of-the-art
techniques on commonly used datasets.Comment: 11 pages, 1 figur
Recommended from our members
d-QPSO: A Quantum-Behaved Particle Swarm Technique for Finding D-Optimal Designs With Discrete and Continuous Factors and a Binary Response
Identifying optimal designs for generalized linear models with a binary response can be a challengingtask, especially when there are both discrete and continuous independent factors in the model. Theoreticalresults rarely exist for such models, and for the handful that do, they usually come with restrictive assumptions.In this article, we propose the d-QPSO algorithm, a modified version of quantum-behaved particleswarm optimization, to find a variety of D-optimal approximate and exact designs for experiments withdiscrete and continuous factors and a binary response. We show that the d-QPSO algorithm can efficientlyfind locally D-optimal designs even for experiments with a large number of factors and robust pseudo-Bayesian designs when nominal values for the model parameters are not available. Additionally, we investigaterobustness properties of the d-QPSO algorithm-generated designs to variousmodel assumptions andprovide real applications to design a bio-plastics odor removal experiment, an electronic static experiment,and a 10-factor car refueling experiment. Supplementary materials for the article are available online
1-Bit Matrix Completion
In this paper we develop a theory of matrix completion for the extreme case
of noisy 1-bit observations. Instead of observing a subset of the real-valued
entries of a matrix M, we obtain a small number of binary (1-bit) measurements
generated according to a probability distribution determined by the real-valued
entries of M. The central question we ask is whether or not it is possible to
obtain an accurate estimate of M from this data. In general this would seem
impossible, but we show that the maximum likelihood estimate under a suitable
constraint returns an accurate estimate of M when ||M||_{\infty} <= \alpha, and
rank(M) <= r. If the log-likelihood is a concave function (e.g., the logistic
or probit observation models), then we can obtain this maximum likelihood
estimate by optimizing a convex program. In addition, we also show that if
instead of recovering M we simply wish to obtain an estimate of the
distribution generating the 1-bit measurements, then we can eliminate the
requirement that ||M||_{\infty} <= \alpha. For both cases, we provide lower
bounds showing that these estimates are near-optimal. We conclude with a suite
of experiments that both verify the implications of our theorems as well as
illustrate some of the practical applications of 1-bit matrix completion. In
particular, we compare our program to standard matrix completion methods on
movie rating data in which users submit ratings from 1 to 5. In order to use
our program, we quantize this data to a single bit, but we allow the standard
matrix completion program to have access to the original ratings (from 1 to 5).
Surprisingly, the approach based on binary data performs significantly better
Recommended from our members
Prediction of progression in idiopathic pulmonary fibrosis using CT scans atbaseline: A quantum particle swarm optimization - Random forest approach
Idiopathic pulmonary fibrosis (IPF) is a fatal lung disease characterized by an unpredictable progressive declinein lung function. Natural history of IPF is unknown and the prediction of disease progression at the time ofdiagnosis is notoriously difficult. High resolution computed tomography (HRCT) has been used for the diagnosisof IPF, but not generally for monitoring purpose. The objective of this work is to develop a novel predictivemodel for the radiological progression pattern at voxel-wise level using only baseline HRCT scans. Mainly, thereare two challenges: (a) obtaining a data set of features for region of interest (ROI) on baseline HRCT scans andtheir follow-up status; and (b) simultaneously selecting important features from high-dimensional space, andoptimizing the prediction performance. We resolved the first challenge by implementing a study design andhaving an expert radiologist contour ROIs at baseline scans, depending on its progression status in follow-upvisits. For the second challenge, we integrated the feature selection with prediction by developing an algorithmusing a wrapper method that combines quantum particle swarm optimization to select a small number of featureswith random forest to classify early patterns of progression. We applied our proposed algorithm to analyzeanonymized HRCT images from 50 IPF subjects from a multi-center clinical trial. We showed that it yields aparsimonious model with 81.8% sensitivity, 82.2% specificity and an overall accuracy rate of 82.1% at the ROIlevel. These results are superior to other popular feature selections and classification methods, in that ourmethod produces higher accuracy in prediction of progression and more balanced sensitivity and specificity witha smaller number of selected features. Our work is the first approach to show that it is possible to use onlybaseline HRCT scans to predict progressive ROIs at 6 months to 1year follow-ups using artificial intelligence
- …