Generalized power method for sparse principal component analysis
In this paper we develop a new approach to sparse principal component
analysis (sparse PCA). We propose two single-unit and two block optimization
formulations of the sparse PCA problem, aimed at extracting a single sparse
dominant principal component of a data matrix, or more components at once,
respectively. While the initial formulations involve nonconvex functions, and
are therefore computationally intractable, we rewrite them into the form of an
optimization program involving maximization of a convex function on a compact
set. The dimension of the search space is decreased enormously if the data
matrix has many more columns (variables) than rows. We then propose and analyze
a simple gradient method suited for the task. Our algorithm exhibits its best
convergence properties when either the objective function or the feasible set
is strongly convex, which is the case for our single-unit formulations and can
be enforced in the block case. Finally, we demonstrate
numerically on a set of random and gene expression test problems that our
approach outperforms existing algorithms both in quality of the obtained
solution and in computational speed.

Keywords: sparse PCA, power method, gradient ascent, strongly convex sets, block algorithms.
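The single-unit formulation lends itself to a power-like gradient iteration: maximize the convex function sum_j max(|a_j^T x| - gamma, 0)^2 over the unit sphere, then read off a sparse loading vector. The sketch below is a simplified reading of that scheme; the thresholded update form and the warm start at the largest-norm column are assumptions of this sketch, not a reproduction of the paper's exact algorithm.

```python
import numpy as np

def sparse_pca_single_unit(A, gamma, n_iter=200):
    """Power-like gradient iteration for single-unit sparse PCA (sketch):
    maximize sum_j max(|a_j^T x| - gamma, 0)^2 over the unit sphere, where
    a_j are the columns (variables) of A.  gamma = 0 recovers the ordinary
    power method, i.e., the leading principal component."""
    col_norms = np.linalg.norm(A, axis=0)
    x = A[:, np.argmax(col_norms)] / col_norms.max()  # warm start (assumption)
    for _ in range(n_iter):
        c = A.T @ x                                    # correlations a_j^T x
        w = np.sign(c) * np.maximum(np.abs(c) - gamma, 0.0)
        g = A @ w                                      # ascent direction
        norm = np.linalg.norm(g)
        if norm == 0.0:                                # gamma too large: all thresholded
            break
        x = g / norm                                   # project back to the unit sphere
    # sparse loading vector recovered from the final iterate
    c = A.T @ x
    z = np.sign(c) * np.maximum(np.abs(c) - gamma, 0.0)
    nz = np.linalg.norm(z)
    return z / nz if nz > 0 else z
```

Larger gamma zeroes out more coordinates of the loading vector, trading explained variance for sparsity.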
Control of quantum phenomena: Past, present, and future
Quantum control is concerned with active manipulation of physical and
chemical processes on the atomic and molecular scale. This work presents a
perspective of progress in the field of control over quantum phenomena, tracing
the evolution of theoretical concepts and experimental methods from early
developments to the most recent advances. The current experimental successes
would be impossible without the development of intense femtosecond laser
sources and pulse shapers. The two most critical theoretical insights were (1)
realizing that ultrafast atomic and molecular dynamics can be controlled via
manipulation of quantum interferences and (2) understanding that optimally
shaped ultrafast laser pulses are the most effective means for producing the
desired quantum interference patterns in the controlled system. Finally, these
theoretical and experimental advances were brought together by the crucial
concept of adaptive feedback control, which is a laboratory procedure employing
measurement-driven, closed-loop optimization to identify the best shapes of
femtosecond laser control pulses for steering quantum dynamics towards the
desired objective. Optimization in adaptive feedback control experiments is
guided by a learning algorithm, with stochastic methods proving to be
especially effective. Adaptive feedback control of quantum phenomena has found
numerous applications in many areas of the physical and chemical sciences, and
this paper reviews the extensive experiments. Other subjects discussed include
quantum optimal control theory, quantum control landscapes, the role of
theoretical control designs in experimental realizations, and real-time quantum
feedback control. The paper concludes with a perspective on open research
directions that are likely to attract significant attention in the future.

Comment: Review article, final version (significantly updated), 76 pages, accepted for publication in New J. Phys. (Focus issue: Quantum control)
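The closed-loop logic of adaptive feedback control can be conveyed with a toy sketch: a stochastic search adjusts pulse-shaper parameters using only a measured objective, with no model of the underlying dynamics. Everything here is illustrative; the simple annealed random-mutation strategy and the Gaussian `toy_yield` stand-in for the laboratory measurement are assumptions of this sketch, not the methods of any particular experiment.

```python
import numpy as np

def adaptive_feedback_loop(measure, n_params, n_gen=100, pop=20, sigma=0.3, seed=0):
    """Toy measurement-driven closed-loop optimization: keep the best
    pulse-shape parameter vector found so far, propose `pop` random mutations
    per generation, and accept a mutation only if its *measured* objective
    improves -- the optimizer never sees a model of the quantum dynamics,
    only measurement outcomes."""
    rng = np.random.default_rng(seed)
    best = rng.uniform(-1.0, 1.0, n_params)       # initial shaper settings
    best_val = measure(best)
    for _ in range(n_gen):
        trials = best + sigma * rng.standard_normal((pop, n_params))
        vals = [measure(t) for t in trials]
        i = int(np.argmax(vals))
        if vals[i] > best_val:
            best, best_val = trials[i], vals[i]
        sigma *= 0.97                             # anneal the mutation strength
    return best, best_val

# Hypothetical stand-in for a measured target-state yield: maximal when all
# spectral-phase parameters vanish (a transform-limited pulse).
def toy_yield(phases):
    return float(np.exp(-np.sum(phases ** 2)))
```

In a laboratory, `measure` would be the detected signal for each shaped pulse; here the loop simply climbs the surrogate yield toward its peak.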
Additive Multi-Index Gaussian process modeling, with application to multi-physics surrogate modeling of the quark-gluon plasma
The Quark-Gluon Plasma (QGP) is a unique phase of nuclear matter, theorized
to have filled the Universe shortly after the Big Bang. A critical challenge in
studying the QGP is that, to reconcile experimental observables with
theoretical parameters, one requires many simulation runs of a complex physics
model over a high-dimensional parameter space. Each run is computationally very
expensive, requiring thousands of CPU hours, thus limiting physicists to only
several hundred runs. Given limited training data for high-dimensional
prediction, existing surrogate models often yield poor predictions with high
predictive uncertainties, leading to imprecise scientific findings. To address
this, we propose a new Additive Multi-Index Gaussian process (AdMIn-GP) model,
which leverages a flexible additive structure on low-dimensional embeddings of
the parameter space. This is guided by prior scientific knowledge that the QGP
is dominated by multiple distinct physical phenomena (i.e., multiphysics), each
involving a small number of latent parameters. The AdMIn-GP models such
embedded structures within a flexible Bayesian nonparametric framework, which
facilitates efficient model fitting via a carefully constructed variational
inference approach with inducing points. We show the effectiveness of the
AdMIn-GP via a suite of numerical experiments and our QGP application, where we
demonstrate considerably improved surrogate modeling performance over existing
models.
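The additive low-dimensional-embedding structure can be sketched as a covariance function: a sum of kernels, each acting on a separate linear projection W_m x of the inputs. The RBF choice for each component and the fixed (non-learned) projections below are assumptions of this illustration; the paper's variational inference with inducing points is not shown.

```python
import numpy as np

def admin_gp_kernel(X1, X2, projections, lengthscales, variances):
    """Additive multi-index covariance sketch:
        k(x, x') = sum_m s_m^2 * exp(-||W_m x - W_m x'||^2 / (2 l_m^2)),
    a sum of RBF kernels on low-dimensional linear embeddings W_m x, one per
    hypothesized physical phenomenon."""
    K = np.zeros((X1.shape[0], X2.shape[0]))
    for W, l, s2 in zip(projections, lengthscales, variances):
        Z1, Z2 = X1 @ W.T, X2 @ W.T             # embed into each index space
        d2 = ((Z1[:, None, :] - Z2[None, :, :]) ** 2).sum(-1)
        K += s2 * np.exp(-0.5 * d2 / l ** 2)
    return K
```

Because each summand depends on x only through a low-dimensional W_m x, the additive kernel concentrates the model's capacity where limited training runs can actually inform it.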
Hyperspectral Unmixing Overview: Geometrical, Statistical, and Sparse Regression-Based Approaches
Imaging spectrometers measure electromagnetic energy scattered in their
instantaneous field of view in hundreds or thousands of spectral channels with
higher spectral resolution than multispectral cameras. Imaging spectrometers
are therefore often referred to as hyperspectral cameras (HSCs). Higher
spectral resolution enables material identification via spectroscopic analysis,
which facilitates countless applications that require identifying materials in
scenarios unsuitable for classical spectroscopic analysis. Due to low spatial
resolution of HSCs, microscopic material mixing, and multiple scattering,
spectra measured by HSCs are mixtures of spectra of materials in a scene. Thus,
accurate estimation requires unmixing. Pixels are assumed to be mixtures of a
few materials, called endmembers. Unmixing involves estimating all or some of:
the number of endmembers, their spectral signatures, and their abundances at
each pixel. Unmixing is a challenging, ill-posed inverse problem because of
model inaccuracies, observation noise, environmental conditions, endmember
variability, and data set size. Researchers have devised and investigated many
models searching for robust, stable, tractable, and accurate unmixing
algorithms. This paper presents an overview of unmixing methods from the time
of Keshava and Mustard's unmixing tutorial [1] to the present. Mixing models
are first discussed. Signal-subspace, geometrical, statistical, sparsity-based,
and spatial-contextual unmixing algorithms are described. Mathematical problems
and potential solutions are described. Algorithm characteristics are
illustrated experimentally.

Comment: This work has been accepted for publication in the IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing.
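Under the linear mixing model, a pixel's spectrum is y = Ma + n, where the columns of M are endmember signatures and the abundances a are nonnegative and sum to one. Below is a minimal sketch of fully constrained abundance estimation using projected gradient descent onto the probability simplex; the solver choice is an assumption of this sketch, and the surveyed literature contains many alternatives.

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection onto the probability simplex, enforcing the
    abundance nonnegativity and sum-to-one constraints."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    rho = np.nonzero(u - css / np.arange(1, len(v) + 1) > 0)[0][-1]
    theta = css[rho] / (rho + 1.0)
    return np.maximum(v - theta, 0.0)

def unmix_pixel(y, M, n_iter=2000):
    """Fully constrained unmixing of one pixel under the linear mixing model
    y = M a + n: projected gradient descent on ||y - M a||^2 with a kept on
    the simplex.  M has shape (bands, endmembers)."""
    p = M.shape[1]
    a = np.full(p, 1.0 / p)                    # start from uniform abundances
    step = 1.0 / np.linalg.norm(M.T @ M, 2)    # 1/L for the smooth quadratic
    for _ in range(n_iter):
        a = project_simplex(a - step * (M.T @ (M @ a - y)))
    return a
```

Real pipelines must also estimate the number of endmembers and the signatures in M themselves, which is where the geometrical, statistical, and sparse-regression families surveyed here diverge.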
Local learning by partitioning
In many machine learning applications data is assumed to be locally simple, where examples near each other have similar characteristics such as class labels or regression responses. Our goal is to exploit this assumption to construct locally simple yet globally complex systems that improve performance or reduce the cost of common machine learning tasks. To this end, we address three main problems: discovering and separating local non-linear structure in high-dimensional data, learning low-complexity local systems to improve performance of risk-based learning tasks, and exploiting local similarity to reduce the test-time cost of learning algorithms.
First, we develop a structure-based similarity metric, where low-dimensional non-linear structure is captured by solving a non-linear, low-rank representation problem. We show that this problem can be kernelized, has a closed-form solution, naturally separates independent manifolds, and is robust to noise. Experimental results indicate that incorporating this structural similarity in well-studied problems such as clustering, anomaly detection, and classification improves performance.
Next, we address the problem of local learning, where a partitioning function divides the feature space into regions where independent functions are applied. We focus on the problem of local linear classification using linear partitioning and local decision functions. Under an alternating minimization scheme, learning the partitioning functions can be reduced to solving a weighted supervised learning problem. We then present a novel reformulation that yields a globally convex surrogate, allowing for efficient, joint training of the partitioning functions and local classifiers.
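The alternating scheme for local linear classification can be sketched on XOR-style data, which no single linear classifier separates but two region-local ones do. This simplified version alternates between fitting a least-squares classifier per region and reassigning points to the best-fitting region; the hard reassignment and squared loss are simplifications of this sketch, and the globally convex joint reformulation is not reproduced.

```python
import numpy as np

def fit_local_linear(X, y, assign, k, n_rounds=10):
    """Alternating minimization sketch for local linear classification:
    step 1 fits one least-squares linear classifier per region given the
    current hard partition; step 2 reassigns each training point to the
    region whose local classifier gives it the largest margin.  (In the full
    method, a partitioning function of x alone is then trained to mimic
    these assignments for use at test time.)"""
    Xb = np.hstack([X, np.ones((len(X), 1))])       # append a bias feature
    W = np.zeros((k, Xb.shape[1]))
    for _ in range(n_rounds):
        for r in range(k):                          # step 1: local classifiers
            idx = assign == r
            if idx.any():
                W[r] = np.linalg.lstsq(Xb[idx], y[idx], rcond=None)[0]
        assign = np.argmax(y[:, None] * (Xb @ W.T), axis=1)  # step 2: reassign
    return W, assign
```

On the XOR pattern, the two local classifiers converge to opposite-signed rules on the second coordinate, one per half-plane, recovering a partition a single linear rule cannot express.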
We then examine the problem of learning under test-time budgets, where acquiring sensors (features) for each example at test time has a cost. Our goal is to partition the space into regions, with only a small subset of sensors needed in each region, reducing the average number of sensors required per example. Starting with a cascade structure and expanding to binary trees, we formulate this problem as an empirical risk minimization and construct an upper-bounding surrogate that allows for sequential decision functions to be trained jointly by solving a linear program. Finally, we present preliminary work extending the notion of test-time budgets to the problem of adaptive privacy.
A submodular optimization framework for never-ending learning : semi-supervised, online, and active learning.
The revolution in information technology and the explosion in the use of computing devices in people's everyday activities have forever changed the perspective of the data mining and machine learning fields. The enormous amounts of easily accessible, information-rich data are pushing the data analysis community towards a paradigm shift. In the new paradigm, data comes in the form of a stream of billions of records received every day. The dynamic nature of the data and its sheer size make it impossible to use the traditional notion of offline learning, where the whole data set is accessible at any point in time. Moreover, no amount of human resources is enough to get expert feedback on the data. In this work we have developed a unified optimization-based learning framework that addresses many of these challenges. Specifically, we developed a Never-Ending Learning framework which combines incremental/online, semi-supervised, and active learning under a unified optimization framework. The established framework is based on the class of submodular optimization methods. At the core of this work we provide a novel formulation of Semi-Supervised Support Vector Machines (S3VM) in terms of submodular set functions. The new formulation overcomes the non-convexity issues of the S3VM and provides a state-of-the-art solution that is orders of magnitude faster than the cutting-edge algorithms in the literature. Next, we provide a stream summarization technique via exemplar selection. This technique makes it possible to keep a fixed-size exemplar representation of a data stream that can be used by any label-propagation-based semi-supervised learning technique. The compact data stream representation allows a wide range of algorithms to be extended to the incremental/online learning scenario. Under the same optimization framework, we provide an active learning algorithm that constitutes the feedback loop between the learning machine and an oracle.
Finally, the developed Never-Ending Learning framework is essentially transductive in nature. Therefore, our last contribution is an inductive incremental learning technique for incremental training of SVMs using the properties of local kernels. We demonstrated through this work the importance and wide applicability of the proposed methodologies.
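The exemplar-selection idea can be illustrated with the classic greedy algorithm for a facility-location submodular objective f(S) = sum_i max_{j in S} sim(x_i, x_j), which enjoys the standard (1 - 1/e) guarantee for monotone submodular maximization under a cardinality constraint. The RBF similarity and the batch (non-streaming) setting are assumptions of this sketch; the thesis's streaming variant maintains a fixed-size exemplar summary online.

```python
import numpy as np

def greedy_exemplars(X, k):
    """Greedy maximization of the facility-location objective
        f(S) = sum_i max_{j in S} sim(x_i, x_j):
    at each step, add the candidate whose inclusion yields the largest
    marginal gain in total coverage.  Similarity is an illustrative
    RBF kernel on raw squared distances."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    sim = np.exp(-d2)                        # pairwise similarities in (0, 1]
    chosen = []
    best = np.zeros(len(X))                  # best[i]: max sim(i, chosen set)
    for _ in range(k):
        # marginal gain of each candidate j, via diminishing returns
        gains = np.maximum(sim, best[:, None]).sum(axis=0) - best.sum()
        gains[chosen] = -np.inf              # never re-pick an exemplar
        j = int(np.argmax(gains))
        chosen.append(j)
        best = np.maximum(best, sim[:, j])
    return chosen
```

Because f is monotone submodular, each greedy pick covers the largest remaining "uncovered" mass, which is why a small exemplar set can stand in for the full stream in downstream label propagation.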