A Tensor-Based Dictionary Learning Approach to Tomographic Image Reconstruction
We consider tomographic reconstruction using priors in the form of a
dictionary learned from training images. The reconstruction has two stages:
first we construct a tensor dictionary prior from our training data, and then
we pose the reconstruction problem in terms of recovering the expansion
coefficients in that dictionary. Our approach differs from past approaches in
that a) we use a third-order tensor representation for our images and b) we
recast the reconstruction problem using the tensor formulation. The dictionary
learning problem is presented as a non-negative tensor factorization problem
with sparsity constraints. The reconstruction problem is formulated in a convex
optimization framework by looking for a solution with a sparse representation
in the tensor dictionary. Numerical results show that our tensor formulation
leads to very sparse representations of both the training images and the
reconstructions, owing to the ability of the tensor dictionary to represent
repeated features compactly.
Comment: 29 pages
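As a rough illustration of the two-stage pipeline, here is a minimal matrix-based sketch in Python. The paper itself works with third-order tensors and a tensor factorization; the sparse-NMF dictionary step, the ISTA solver, and all names below are simplified stand-ins for the tensor machinery, not the authors' algorithm.

```python
import numpy as np

def learn_dictionary(patches, n_atoms, mu=0.1, n_iter=200, seed=0):
    """Stage 1: sparse non-negative factorization patches ~ D @ C
    (multiplicative updates; mu is an l1 sparsity weight on the codes)."""
    rng = np.random.default_rng(seed)
    P = np.asarray(patches, dtype=float)             # (patch_size, n_patches), entries >= 0
    D = rng.random((P.shape[0], n_atoms))
    C = rng.random((n_atoms, P.shape[1]))
    for _ in range(n_iter):
        C *= (D.T @ P) / (D.T @ D @ C + mu + 1e-12)  # sparsity-penalized code update
        D *= (P @ C.T) / (D @ C @ C.T + 1e-12)       # dictionary-atom update
    return D / np.maximum(np.linalg.norm(D, axis=0), 1e-12)

def reconstruct(A, b, D, lam=0.1, n_iter=500):
    """Stage 2: solve min_c 0.5*||A D c - b||^2 + lam*||c||_1 with ISTA,
    then map the expansion coefficients back to an image x = D c."""
    M = A @ D                                        # forward operator composed with dictionary
    L = np.linalg.norm(M, 2) ** 2 + 1e-12            # Lipschitz constant of the gradient
    c = np.zeros(D.shape[1])
    for _ in range(n_iter):
        v = c - (M.T @ (M @ c - b)) / L              # gradient step on the data-fit term
        c = np.sign(v) * np.maximum(np.abs(v) - lam / L, 0.0)  # soft thresholding
    return D @ c
```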
Sampling constrained probability distributions using Spherical Augmentation
Statistical models with constrained probability distributions are abundant in
machine learning. Some examples include regression models with norm constraints
(e.g., Lasso), probit, many copula models, and latent Dirichlet allocation
(LDA). Bayesian inference involving probability distributions confined to
constrained domains can be quite challenging for commonly used sampling
algorithms. In this paper, we propose a novel augmentation technique that
handles a wide range of constraints by mapping the constrained domain to a
sphere in the augmented space. By moving freely on the surface of this sphere,
sampling algorithms handle constraints implicitly and generate proposals that
remain within boundaries when mapped back to the original space. Our proposed
method, called Spherical Augmentation, provides a mathematically natural and
computationally efficient framework for sampling from constrained probability
distributions. We show the advantages of our method over state-of-the-art
sampling algorithms, such as exact Hamiltonian Monte Carlo, using several
examples including truncated Gaussian distributions, Bayesian Lasso, Bayesian
bridge regression, reconstruction of quantized stationary Gaussian process, and
LDA for topic modeling.
Comment: 41 pages, 13 figures
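To make the idea concrete, below is a minimal Python sketch of the ball-type case: augmenting theta with an extra coordinate ±sqrt(r^2 - ||theta||^2) places the state on a sphere, a random-walk sampler moves freely along geodesics of that sphere, and dropping the extra coordinate maps each state back into the ball. The Jacobian factor, step size, and all names are illustrative assumptions of this sketch; the paper develops more elaborate Hamiltonian and Lagrangian Monte Carlo variants and handles more general constraints.

```python
import numpy as np

def sphere_rw_sampler(log_target, dim, radius=1.0, step=0.2, n_samples=5000, seed=0):
    """Random-walk Metropolis on the sphere S^dim embedded in R^{dim+1},
    targeting a density pi(theta) supported on the ball ||theta|| <= radius.
    The change of variables contributes a |z_{D+1}| factor to the target
    (an assumption of this sketch)."""
    rng = np.random.default_rng(seed)
    z = np.zeros(dim + 1)
    z[-1] = radius                                   # start at the north pole (theta = 0)

    def log_p(z):
        return log_target(z[:-1]) + np.log(np.abs(z[-1]) + 1e-300)

    samples = np.empty((n_samples, dim))
    for i in range(n_samples):
        # symmetric geodesic proposal: fixed arc length, random tangent direction
        eta = rng.standard_normal(dim + 1)
        u = eta - (eta @ z) * z / radius**2          # project noise onto tangent space at z
        u /= np.linalg.norm(u)
        z_new = z * np.cos(step) + radius * u * np.sin(step)
        if np.log(rng.random()) < log_p(z_new) - log_p(z):
            z = z_new
        samples[i] = z[:-1]                          # dropping z_{D+1} maps back to the ball
    return samples

# usage: a standard Gaussian truncated to the unit ball (illustrative)
draws = sphere_rw_sampler(lambda t: -0.5 * t @ t, dim=2, radius=1.0)
```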
The Augmented Synthetic Control Method
The synthetic control method (SCM) is a popular approach for estimating the
impact of a treatment on a single unit in panel data settings. The "synthetic
control" is a weighted average of control units that balances the treated
unit's pre-treatment outcomes as closely as possible. A critical feature of the
original proposal is to use SCM only when the fit on pre-treatment outcomes is
excellent. We propose Augmented SCM as an extension of SCM to settings where
such pre-treatment fit is infeasible. Analogous to bias correction for inexact
matching, Augmented SCM uses an outcome model to estimate the bias due to
imperfect pre-treatment fit and then de-biases the original SCM estimate. Our
main proposal, which uses ridge regression as the outcome model, directly
controls pre-treatment fit while minimizing extrapolation from the convex hull.
This estimator can also be expressed as a solution to a modified synthetic
controls problem that allows negative weights on some donor units. We bound the
estimation error of this approach under different data generating processes,
including a linear factor model, and show how regularization helps to avoid
over-fitting to noise. We demonstrate gains from Augmented SCM with extensive
simulation studies and apply this framework to estimate the impact of the 2012
Kansas tax cuts on economic growth. We implement the proposed method in the new
augsynth R package.
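To make the de-biasing step concrete, here is a small numpy sketch of the ridge-augmented estimator: SCM weights from simplex-constrained least squares, followed by a ridge-regression correction for the remaining pre-treatment imbalance. The function names, the SLSQP solver, and the fixed penalty lam are illustrative assumptions, not the internals of augsynth, which is the authors' reference implementation.

```python
import numpy as np
from scipy.optimize import minimize

def scm_weights(X0, x1):
    """SCM step: simplex-constrained least squares
    min_w ||x1 - X0 @ w||^2  s.t.  w >= 0, sum(w) = 1."""
    n = X0.shape[1]
    res = minimize(
        fun=lambda w: np.sum((x1 - X0 @ w) ** 2),
        x0=np.full(n, 1.0 / n),
        jac=lambda w: -2 * X0.T @ (x1 - X0 @ w),
        bounds=[(0, 1)] * n,
        constraints=[{"type": "eq", "fun": lambda w: np.sum(w) - 1}],
        method="SLSQP",
    )
    return res.x

def ridge_ascm(X0, x1, y0_post, lam=1.0):
    """Ridge-augmented SCM estimate of the treated unit's counterfactual.
    X0:      (T_pre, N_co) pre-treatment outcomes of the control (donor) units
    x1:      (T_pre,)      pre-treatment outcomes of the treated unit
    y0_post: (N_co,)       post-treatment outcomes of the control units"""
    w = scm_weights(X0, x1)
    # ridge regression of post-period outcomes on pre-period outcomes (controls only)
    beta = np.linalg.solve(X0 @ X0.T + lam * np.eye(X0.shape[0]), X0 @ y0_post)
    # de-bias the SCM estimate by the remaining pre-treatment imbalance
    return y0_post @ w + (x1 - X0 @ w) @ beta
```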
Similarity Learning via Kernel Preserving Embedding
Data similarity is a key concept in many data-driven applications. Many
algorithms are sensitive to the choice of similarity measure. To tackle this
fundamental problem, automatic learning of similarity information from data via
self-expression has been developed and successfully applied in various models,
such as low-rank representation, sparse subspace learning, and semi-supervised
learning. However, self-expression merely reconstructs the original data, and
valuable information, e.g., the manifold structure, is largely ignored. In this
paper, we argue that it is beneficial to preserve the overall relations when we
extract similarity information. Specifically, we propose a novel similarity
learning framework by minimizing the reconstruction error of kernel matrices,
rather than the reconstruction error of original data adopted by existing work.
Taking the clustering task as an example to evaluate our method, we observe
considerable improvements compared to other state-of-the-art methods. More
importantly, our proposed framework is very general and provides a novel and
fundamental building block for many other similarity-based tasks. Moreover, the
proposed kernel-preserving embedding opens up many possibilities for embedding
high-dimensional data into a low-dimensional space.
Comment: Published in AAAI 2019
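To illustrate the core idea, here is a minimal Python sketch that learns the similarity matrix by minimizing the kernel reconstruction error with a ridge penalty, min_Z ||K - K Z||_F^2 + alpha ||Z||_F^2, whose closed-form minimizer is Z = (K^T K + alpha I)^{-1} K^T K. The RBF kernel, the ridge (rather than sparse or low-rank) regularizer, and the spectral-clustering step are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np
from sklearn.cluster import SpectralClustering
from sklearn.metrics.pairwise import rbf_kernel

def learn_similarity(X, alpha=1.0, gamma=1.0):
    """Kernel self-expression: minimize ||K - K Z||_F^2 + alpha ||Z||_F^2,
    so the learned Z preserves the kernel matrix rather than the raw data."""
    K = rbf_kernel(X, gamma=gamma)                           # kernel matrix to be preserved
    KtK = K.T @ K
    Z = np.linalg.solve(KtK + alpha * np.eye(len(K)), KtK)   # closed-form minimizer
    return (np.abs(Z) + np.abs(Z.T)) / 2                     # symmetrize into an affinity matrix

# usage: plug the learned similarity into a clustering task
X = np.random.default_rng(0).standard_normal((100, 5))
labels = SpectralClustering(n_clusters=3, affinity="precomputed").fit_predict(
    learn_similarity(X)
)
```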