A Simple Iterative Algorithm for Parsimonious Binary Kernel Fisher Discrimination
By applying recent results in optimization theory, variously known as optimization transfer or majorize/minimize algorithms, an algorithm for binary kernel Fisher discriminant analysis is introduced that uses a non-smooth penalty on the coefficients to provide a parsimonious solution. The problem is converted into a smooth optimization that can be solved iteratively with no greater overhead than iteratively re-weighted least squares. The result is simple, easily programmed, and is shown to perform, in terms of both accuracy and parsimony, as well as or better than a number of leading machine learning algorithms on two well-studied and substantial benchmarks.
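To illustrate the re-weighted least-squares flavour of the iteration described above, the following sketch applies the same majorize/minimize idea to an l1-penalised least-squares problem. This is an illustration only, not the paper's kernel Fisher discriminant; the function name and data are hypothetical.

```python
import numpy as np

def mm_sparse_ls(X, y, lam=0.1, n_iter=50, eps=1e-8):
    """Majorize/minimize for 0.5*||y - X w||^2 + lam*||w||_1.
    Each |w_j| is majorized by a quadratic at the current iterate,
    so every step reduces to a re-weighted ridge-regression solve."""
    w = np.linalg.lstsq(X, y, rcond=None)[0]   # unpenalised warm start
    for _ in range(n_iter):
        # Quadratic majorizer of the l1 term contributes diag(lam / |w_j|)
        D = np.diag(lam / (np.abs(w) + eps))
        w = np.linalg.solve(X.T @ X + D, X.T @ y)
    return w
```

Coefficients that the l1 penalty would zero out are driven towards zero geometrically by the growing weights, which is what produces the parsimonious solution.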
Probabilistic Reduced-Order Modeling for Stochastic Partial Differential Equations
We discuss a Bayesian formulation of coarse-graining (CG) for PDEs whose
coefficients (e.g. material parameters) exhibit random, fine-scale variability.
The direct solution of such problems requires grids fine enough to resolve
this variability, which unavoidably entails the repeated solution of very
large systems of algebraic equations. We establish a physically inspired,
data-driven coarse-grained model which learns a low-dimensional set of
microstructural features that are predictive of the fine-grained (FG) model
response. Once learned, those features provide a sharp distribution over the
coarse-scale effective coefficients of the PDE that are most suitable for
predicting the fine-scale model output. This ultimately allows one to replace
the computationally expensive FG model with a generative probabilistic model
based on evaluating the much cheaper CG model several times. Sparsity-enforcing
priors further increase predictive efficiency and reveal the microstructural
features that are important in predicting the FG response. Moreover, the model
yields probabilistic rather than single-point predictions, enabling
quantification of the unavoidable epistemic uncertainty that is present due to
the information loss incurred during coarse-graining.
A sparse multinomial probit model for classification
A recent development in penalized probit modelling using a hierarchical Bayesian approach has led to a sparse binomial (two-class) probit classifier that can be trained via an EM algorithm. A key advantage of the formulation is that no tuning of hyperparameters relating to the penalty is needed, thus simplifying the model selection process. The resulting model demonstrates excellent classification performance and a high degree of sparsity when used as a kernel machine. It is, however, restricted to the binary classification problem and can only be used in the multinomial situation via a one-against-all or one-against-many strategy. To overcome this, we apply the idea to the multinomial probit model. This leads to a direct multi-class classification approach and is shown to give a sparse solution with accuracy and sparsity comparable with the current state-of-the-art. Comparative numerical benchmark examples are used to demonstrate the method.
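For intuition, class probabilities under a multinomial probit model can be approximated by Monte Carlo over the latent Gaussian utilities. The sketch below shows only this prediction step, given assumed weights `W`; it is not the paper's EM training procedure, and all names are hypothetical.

```python
import numpy as np

def multinomial_probit_probs(W, x, n_samples=20000, rng=None):
    """Monte-Carlo class probabilities for a multinomial probit model:
    each class k has latent utility z_k = w_k^T x + eps_k, eps_k ~ N(0, 1),
    and the predicted class is the one with the largest utility."""
    rng = np.random.default_rng(rng)
    u = W @ x                                        # mean utilities, shape (K,)
    z = u + rng.standard_normal((n_samples, len(u)))  # sampled utilities
    wins = np.bincount(z.argmax(axis=1), minlength=len(u))
    return wins / n_samples
```

The direct multi-class formulation avoids the one-against-all reduction: a single set of weights produces a full probability vector over the classes.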
Bayesian Estimation of Intensity Surfaces on the Sphere via Needlet Shrinkage and Selection
This paper describes an approach for Bayesian modeling of spherical datasets. Our method is based upon a recent construction called the needlet, a particular form of spherical wavelet with many favorable statistical and computational properties. We perform shrinkage and selection of needlet coefficients, focusing on two main alternatives: empirical-Bayes thresholding, and Bayesian local shrinkage rules. We study the performance of the proposed methodology both on simulated data and on two real data sets: one involving the cosmic microwave background radiation, and one involving the reconstruction of a global news intensity surface inferred from Reuters articles published in August 1996. The fully Bayesian approach based on robust, sparse shrinkage priors seems to outperform the other alternatives.
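To convey the general flavour of coefficient shrinkage and selection in a wavelet-type basis (a crude stand-in for the paper's needlet coefficients and its empirical-Bayes rules, which are more refined), one might hard-threshold coefficients below a universal threshold estimated from the data:

```python
import numpy as np

def universal_threshold(coeffs):
    """Hard-threshold wavelet-type coefficients: estimate the noise scale
    from the median absolute coefficient, then zero everything below the
    universal threshold sigma * sqrt(2 * log n).  Large coefficients,
    presumed to carry signal, pass through unchanged."""
    n = len(coeffs)
    sigma = np.median(np.abs(coeffs)) / 0.6745   # robust noise-scale estimate
    t = sigma * np.sqrt(2.0 * np.log(n))
    return np.where(np.abs(coeffs) > t, coeffs, 0.0)
```

Fully Bayesian shrinkage rules replace this all-or-nothing cut with a smooth, posterior-based amount of shrinkage per coefficient.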
A Hierarchical Bayesian Framework for Constructing Sparsity-inducing Priors
Variable selection techniques have become increasingly popular amongst
statisticians due to an increased number of regression and classification
applications involving high-dimensional data where we expect some predictors to
be unimportant. In this context, Bayesian variable selection techniques
involving Markov chain Monte Carlo exploration of the posterior distribution
over models can be prohibitively computationally expensive and so there has
been attention paid to quasi-Bayesian approaches such as maximum a posteriori
(MAP) estimation using priors that induce sparsity in such estimates. We focus
on this latter approach, expanding on the hierarchies proposed to date to
provide a Bayesian interpretation and generalization of state-of-the-art
penalized optimization approaches and providing simultaneously a natural way to
include prior information about parameters within this framework. We give
examples of how to use this hierarchy to compute MAP estimates for linear and
logistic regression as well as sparse precision-matrix estimates in Gaussian
graphical models. In addition, an adaptive group lasso method is derived using
the framework.
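The adaptive group lasso mentioned above penalises the l2 norm of each coefficient group with its own weight, and its proximal operator is a block-wise soft threshold. A minimal sketch, with a hypothetical group structure:

```python
import numpy as np

def group_soft_threshold(w, groups, thresholds):
    """Proximal operator of the (adaptive) group-lasso penalty
    sum_g t_g * ||w_g||_2: each group's norm is shrunk by its own
    threshold t_g, and groups whose norm falls below t_g are zeroed."""
    out = np.zeros_like(w)
    for g, t in zip(groups, thresholds):
        norm = np.linalg.norm(w[g])
        if norm > t:
            out[g] = (1.0 - t / norm) * w[g]   # shrink, keep direction
    return out
```

MAP estimation with such a prior amounts to interleaving this operator with gradient steps on the likelihood, which is how entire groups of coefficients are selected in or out together.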
Sparse Bilinear Logistic Regression
In this paper, we introduce the concept of sparse bilinear logistic
regression for decision problems involving explanatory variables that are
two-dimensional matrices. Such problems are common in computer vision,
brain-computer interfaces, style/content factorization, and parallel factor
analysis. The underlying optimization problem is bi-convex; we study its
solution and develop an efficient algorithm based on block coordinate descent.
We provide a theoretical guarantee of global convergence and estimate the
asymptotic convergence rate using the Kurdyka–Łojasiewicz inequality. A
range of experiments with simulated and real data demonstrate that sparse
bilinear logistic regression outperforms current techniques in several
important applications.
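A minimal sketch of the bi-convex structure described above: for P(y=1 | X) = sigmoid(u^T X v) with a matrix covariate X, alternate l1-proximal gradient steps in u (v fixed) and v (u fixed). This is an illustrative simplification, not the paper's algorithm; the renormalisation simply exploits the model's scale ambiguity (u, v) -> (c*u, v/c).

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sparse_bilinear_logreg(Xs, y, lam=0.01, lr=0.2, n_outer=300, seed=0):
    """Block-coordinate proximal-gradient sketch for sparse bilinear
    logistic regression.  The loss is convex in u for fixed v and vice
    versa, so each block update is an ordinary sparse logistic step."""
    rng = np.random.default_rng(seed)
    n, p, q = len(Xs), Xs[0].shape[0], Xs[0].shape[1]
    u, v = rng.normal(size=p), rng.normal(size=q)
    u, v = u / np.linalg.norm(u), v / np.linalg.norm(v)
    prox = lambda w, t: np.sign(w) * np.maximum(np.abs(w) - t, 0.0)
    for _ in range(n_outer):
        F = np.stack([X @ v for X in Xs])            # (n, p): features for u
        u = prox(u - lr * F.T @ (sigmoid(F @ u) - y) / n, lr * lam)
        u /= np.linalg.norm(u)
        G = np.stack([X.T @ u for X in Xs])          # (n, q): features for v
        v = prox(v - lr * G.T @ (sigmoid(G @ v) - y) / n, lr * lam)
        v /= np.linalg.norm(v)
    return u, v
```

With p + q parameters instead of p * q, the bilinear form is what makes matrix-valued covariates (images, EEG trials) tractable.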