Preconditioning Kernel Matrices
The computational and storage complexity of kernel machines presents the
primary barrier to their scaling to large, modern datasets. A common way to
tackle the scalability issue is to use the conjugate gradient algorithm, which
relieves the constraints on both storage (the kernel matrix need not be stored)
and computation (both stochastic gradients and parallelization can be used).
Even so, conjugate gradient is not without its own issues: the conditioning of
kernel matrices is often such that conjugate gradients will have poor
convergence in practice. Preconditioning is a common approach to alleviating
this issue. Here we propose preconditioned conjugate gradients for kernel
machines, and develop a broad range of preconditioners particularly useful for
kernel matrices. We describe a scalable approach to both solving kernel
machines and learning their hyperparameters. We show this approach is exact in
the limit of iterations and outperforms state-of-the-art approximations for a
given computational budget.
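The preconditioned conjugate gradient idea in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the RBF kernel, the Nyström-style preconditioner applied through the Woodbury identity, and all function names and parameters (`rbf_kernel`, `kernel_matvec`, `nystrom_preconditioner`, `pcg`, `lengthscale`, `block`) are assumptions chosen for the example. It solves (K + noise·I) x = b while forming the kernel matrix only one block of rows at a time.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0):
    """Squared-exponential kernel between the rows of A and B (assumed kernel)."""
    sq = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2.0 * A @ B.T
    return np.exp(-0.5 * np.maximum(sq, 0.0) / lengthscale**2)

def kernel_matvec(X, v, noise, lengthscale=1.0, block=256):
    """Compute (K + noise*I) v block by block, never storing the full K."""
    out = noise * v
    for start in range(0, len(X), block):
        rows = slice(start, start + block)
        out[rows] += rbf_kernel(X[rows], X, lengthscale) @ v
    return out

def nystrom_preconditioner(X, inducing_idx, noise, lengthscale=1.0):
    """Return v -> P^{-1} v for P = K_nm K_mm^{-1} K_mn + noise*I (Woodbury)."""
    Z = X[inducing_idx]
    K_nm = rbf_kernel(X, Z, lengthscale)
    K_mm = rbf_kernel(Z, Z, lengthscale)
    inner = noise * K_mm + K_nm.T @ K_nm
    # Small relative jitter keeps the inner solve numerically stable.
    inner += 1e-8 * np.trace(inner) / len(inner) * np.eye(len(inner))
    def apply(v):
        return (v - K_nm @ np.linalg.solve(inner, K_nm.T @ v)) / noise
    return apply

def pcg(matvec, b, precond, tol=1e-8, max_iter=500):
    """Preconditioned conjugate gradients for a symmetric positive definite system."""
    x = np.zeros_like(b)
    r = b - matvec(x)
    z = precond(r)
    p = z.copy()
    rz = r @ z
    for it in range(max_iter):
        Ap = matvec(p)
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            return x, it + 1
        z = precond(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x, max_iter
```

A call such as `pcg(lambda v: kernel_matvec(X, v, noise), y, nystrom_preconditioner(X, idx, noise))` returns the solution together with the number of iterations used. How much the preconditioner helps depends on the kernel, the noise level, and the number of inducing points.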
Inconsistency of Bayesian Inference for Misspecified Linear Models, and a Proposal for Repairing It
We empirically show that Bayesian inference can be inconsistent under
misspecification in simple linear regression problems, both in a model
averaging/selection and in a Bayesian ridge regression setting. We use the
standard linear model, which assumes homoskedasticity, whereas the data are
heteroskedastic, and observe that the posterior puts its mass on ever more
high-dimensional models as the sample size increases. To remedy the problem, we
equip the likelihood in Bayes' theorem with an exponent called the learning
rate, and we propose the Safe Bayesian method to learn the learning rate from
the data. SafeBayes tends to select small learning rates as soon as the standard
posterior is not `cumulatively concentrated', and its results on our data are
quite encouraging. Comment: 70 pages, 20 figures
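The effect of raising the likelihood to a learning rate can be illustrated for Bayesian ridge regression. The sketch below assumes a Gaussian likelihood with known noise variance and a zero-mean isotropic Gaussian prior; the function name `tempered_posterior` and its parameters are hypothetical. It only computes the posterior for a given learning rate `eta` and does not implement the SafeBayes procedure for choosing `eta` from the data.

```python
import numpy as np

def tempered_posterior(X, y, eta, noise_var=1.0, prior_var=1.0):
    """Gaussian posterior over weights when the likelihood is raised to the power eta.

    Assumed model: y ~ N(X w, noise_var * I), prior w ~ N(0, prior_var * I).
    With eta = 1 this is the standard Bayesian ridge posterior.
    """
    d = X.shape[1]
    precision = (eta / noise_var) * X.T @ X + np.eye(d) / prior_var
    cov = np.linalg.inv(precision)
    mean = cov @ ((eta / noise_var) * X.T @ y)
    return mean, cov
```

Under these assumptions, tempering with `eta` is equivalent to inflating the noise variance to `noise_var / eta`, so a smaller learning rate gives a wider posterior that leans more on the prior.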
Hyperspectral Unmixing Overview: Geometrical, Statistical, and Sparse Regression-Based Approaches
Imaging spectrometers measure electromagnetic energy scattered in their
instantaneous field of view in hundreds or thousands of spectral channels with
higher spectral resolution than multispectral cameras. Imaging spectrometers
are therefore often referred to as hyperspectral cameras (HSCs). Higher
spectral resolution enables material identification via spectroscopic analysis,
which facilitates countless applications that require identifying materials in
scenarios unsuitable for classical spectroscopic analysis. Due to low spatial
resolution of HSCs, microscopic material mixing, and multiple scattering,
spectra measured by HSCs are mixtures of spectra of materials in a scene. Thus,
accurate estimation requires unmixing. Pixels are assumed to be mixtures of a
few materials, called endmembers. Unmixing involves estimating all or some of:
the number of endmembers, their spectral signatures, and their abundances at
each pixel. Unmixing is a challenging, ill-posed inverse problem because of
model inaccuracies, observation noise, environmental conditions, endmember
variability, and data set size. Researchers have devised and investigated many
models searching for robust, stable, tractable, and accurate unmixing
algorithms. This paper presents an overview of unmixing methods from the time
of Keshava and Mustard's unmixing tutorial [1] to the present. Mixing models
are first discussed. Signal-subspace, geometrical, statistical, sparsity-based,
and spatial-contextual unmixing algorithms are described. Mathematical problems
and potential solutions are described. Algorithm characteristics are
illustrated experimentally. Comment: This work has been accepted for publication in IEEE Journal of
Selected Topics in Applied Earth Observations and Remote Sensing
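As a small illustration of abundance estimation, the sketch below assumes a linear mixing model, y = M a + noise, with known endmember signatures in the columns of M and abundances a that are nonnegative and sum to one. It uses projected gradient descent onto the probability simplex, which is one simple choice for this example and not a method taken from the overview. The names `project_simplex` and `unmix_pixel` are hypothetical.

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of v onto {a : a >= 0, sum(a) = 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    k = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - (css - 1.0) / k > 0)[0][-1]
    theta = (css[rho] - 1.0) / (rho + 1)
    return np.maximum(v - theta, 0.0)

def unmix_pixel(M, y, n_iter=2000):
    """Estimate abundances for one pixel under the assumed linear mixing model.

    M: (bands, endmembers) matrix of endmember spectra (assumed known).
    y: (bands,) observed pixel spectrum.
    """
    n_end = M.shape[1]
    a = np.full(n_end, 1.0 / n_end)          # start from uniform abundances
    step = 1.0 / np.linalg.norm(M, 2) ** 2   # 1 / Lipschitz constant of the gradient
    for _ in range(n_iter):
        grad = M.T @ (M @ a - y)
        a = project_simplex(a - step * grad)
    return a
```

This covers only the last of the three estimation problems listed in the abstract (abundances at each pixel). Estimating the number of endmembers and their spectral signatures needs separate methods.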
Kernel conditional quantile estimation via reduction revisited
Quantile regression refers to the process of estimating the quantiles of a conditional distribution and has many important applications within econometrics and data mining, among other domains. In this paper, we show how to estimate these conditional quantile functions within a Bayes risk minimization framework using a Gaussian process prior. The resulting non-parametric probabilistic model is easy to implement and allows non-crossing quantile functions to be enforced. Moreover, it can directly be used in combination with tools and extensions of standard Gaussian Processes such as principled hyperparameter estimation, sparsification, and quantile regression with input-dependent noise rates. No existing approach enjoys all of these desirable properties. Experiments on benchmark datasets show that our method is competitive with state-of-the-art approaches.
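Quantile estimation is commonly framed as minimizing the pinball (check) loss, whose minimizer is the corresponding quantile. The sketch below shows that loss on its own, as background for the risk-minimization view. It is not the Gaussian process model from the abstract, and the function name `pinball_loss` is an assumption for this example.

```python
import numpy as np

def pinball_loss(y, pred, tau):
    """Mean pinball loss of predictions `pred` for quantile level tau in (0, 1)."""
    diff = y - pred
    # Under-prediction is weighted by tau, over-prediction by (1 - tau).
    return np.mean(np.maximum(tau * diff, (tau - 1.0) * diff))
```

Evaluating this loss over a grid of constant predictions and taking the minimizer recovers an empirical quantile of `y`. In a conditional model the constant is replaced by a function of the inputs.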
Unsupervised feature learning with discriminative encoder
In recent years, deep discriminative models have achieved extraordinary
performance on supervised learning tasks, significantly outperforming their
generative counterparts. However, their success relies on the presence of a
large amount of labeled data. How can one use the same discriminative models
for learning useful features in the absence of labels? We address this question
in this paper, by jointly modeling the distribution of data and latent features
in a manner that explicitly assigns zero probability to unobserved data. Rather
than maximizing the marginal probability of observed data, we maximize the
joint probability of the data and the latent features using a two-step EM-like
procedure. To prevent the model from overfitting to our initial selection of
latent features, we use adversarial regularization. Depending on the task, we
allow the latent features to be one-hot or real-valued vectors and define a
suitable prior on the features. For instance, one-hot features correspond to
class labels and are directly used for the unsupervised and semi-supervised
classification task, whereas real-valued feature vectors are fed as input to
simple classifiers for auxiliary supervised discrimination tasks. The proposed
model, which we dub discriminative encoder (or DisCoder), is flexible in the
type of latent features that it can capture. The proposed model achieves
state-of-the-art performance on several challenging tasks. Comment: 10 pages, 4 figures, International Conference on Data Mining, 201
A simple preconditioned domain decomposition method for electromagnetic scattering problems
We present a domain decomposition method (DDM) devoted to the iterative
solution of time-harmonic electromagnetic scattering problems, involving large
and resonant cavities. This DDM uses the electric field integral equation
(EFIE) for the solution of Maxwell problems in both interior and exterior
subdomains, and we propose a simple preconditioner for the global method, based
on the single layer operator restricted to the fictitious interface between the
two subdomains. Comment: 23 pages