Missing Data in Sparse Transition Matrix Estimation for Sub-Gaussian Vector Autoregressive Processes
High-dimensional time series data exist in numerous areas such as finance,
genomics, healthcare, and neuroscience. An unavoidable aspect of all such
datasets is missing data, and dealing with this issue has been an important
focus in statistics, control, and machine learning. In this work, we consider a
high-dimensional estimation problem where a dynamical system, governed by a
stable vector autoregressive model, is randomly and only partially observed at
each time point. Our task amounts to estimating the transition matrix, which is
assumed to be sparse. In such a scenario, where covariates are highly
interdependent and partially missing, new theoretical challenges arise. While
transition matrix estimation in vector autoregressive models has been studied
previously, the missing data scenario requires separate treatment. Moreover,
while transition matrix estimation can be studied from a high-dimensional
sparse linear regression perspective, the covariates are highly dependent and
existing results on regularized estimation with missing data from
i.i.d. covariates are not applicable. At the heart of our analysis lie (1) a
novel concentration result for when the innovation noise satisfies the convex
concentration property, and (2) a new quantity for characterizing the
interactions of the time-varying observation process with the underlying
dynamical system.
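As a concrete illustration (a minimal sketch of one standard route, not the paper's exact estimator): under an i.i.d. Bernoulli(rho) observation mask, form unbiased plug-in estimates of the lag-0 and lag-1 covariances from zero-filled observations, then recover each row of the transition matrix by an l1-penalized quadratic program. The known rate `rho`, the helper names, and the plain ISTA solver are all illustrative assumptions.

```python
import numpy as np

def masked_var_covariances(Z, rho):
    """Unbiased plug-in estimates of the lag-0 and lag-1 covariances from
    zero-filled observations Z (T x p), assuming each coordinate is seen
    independently with probability rho at each time."""
    T, p = Z.shape
    S0 = Z[:-1].T @ Z[:-1] / (T - 1)             # raw lag-0 second moment
    S1 = Z[1:].T @ Z[:-1] / (T - 1)              # raw lag-1 cross moment
    Sigma0 = S0 / rho**2                         # off-diagonals see two masks
    np.fill_diagonal(Sigma0, np.diag(S0) / rho)  # diagonal sees one mask
    Sigma1 = S1 / rho**2                         # masks at t, t+1 independent
    return Sigma0, Sigma1

def lasso_row(Sigma0, gamma, lam, iters=500):
    """ISTA on 0.5 a' Sigma0 a - a' gamma + lam ||a||_1. Note the corrected
    Sigma0 can be indefinite, so this solver is only a heuristic here."""
    step = 1.0 / max(np.abs(np.linalg.eigvalsh(Sigma0)).max(), 1e-12)
    a = np.zeros_like(gamma)
    for _ in range(iters):
        u = a - step * (Sigma0 @ a - gamma)
        a = np.sign(u) * np.maximum(np.abs(u) - step * lam, 0.0)
    return a

def estimate_transition(Z, rho, lam):
    """Row-wise sparse estimate of A in x_{t+1} = A x_t + noise, using
    Sigma1 = A Sigma0 so each row solves a penalized quadratic."""
    Sigma0, Sigma1 = masked_var_covariances(Z, rho)
    return np.vstack([lasso_row(Sigma0, Sigma1[i], lam)
                      for i in range(Z.shape[1])])
```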
Regularized Non-Gaussian Image Denoising
In image denoising problems, one widely-adopted approach is to minimize a
regularized data-fit objective function, where the data-fit term is derived
from a physical image acquisition model. Typically the regularizer is selected
with two goals in mind: (a) to accurately reflect image structure, such as
smoothness or sparsity, and (b) to ensure that the resulting optimization
problem is convex and can be solved efficiently. The space of such regularizers
in Gaussian noise settings is well studied; however, non-Gaussian noise models
lead to data-fit expressions for which entirely different families of
regularizers may be effective. These regularizers have received less attention
in the literature because they yield non-convex optimization problems in
Gaussian noise settings. This paper describes such regularizers and a simple
reparameterization approach that allows image reconstruction to be accomplished
using efficient convex optimization tools. The proposed approach essentially
modifies the objective function to facilitate taking advantage of tools such as
proximal denoising routines. We present examples of image denoising under
exponential family (Bernoulli and Poisson) and multiplicative noise.
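As a toy illustration of the reparameterization idea (a generic sketch, not the paper's algorithm): for Poisson denoising, working in the log-intensity z = log(x) makes the negative log-likelihood exp(z) - y*z convex in z, so a smoothness regularizer on z yields a convex problem solvable by plain gradient descent. The quadratic-difference regularizer and step size below are illustrative choices.

```python
import numpy as np

def poisson_denoise_log(y, lam=0.1, step=0.05, iters=300):
    """Denoise a 2-D array of Poisson counts y by estimating the
    log-intensity z: minimize sum(exp(z) - y*z) + lam * ||grad z||^2,
    which is convex in z. Step size assumes moderate intensities."""
    z = np.log(np.maximum(y, 1.0))       # initialize at log of clipped counts
    for _ in range(iters):
        grad = np.exp(z) - y             # gradient of the Poisson data fit
        gz = np.zeros_like(z)            # gz = D'D z for finite differences D
        dv = z[1:, :] - z[:-1, :]
        gz[1:, :] += dv
        gz[:-1, :] -= dv
        dh = z[:, 1:] - z[:, :-1]
        gz[:, 1:] += dh
        gz[:, :-1] -= dh
        z -= step * (grad + 2.0 * lam * gz)
    return np.exp(z)                     # map back to the intensity domain
```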
Sparse Linear Regression With Missing Data
This paper proposes a fast and accurate method for sparse regression in the
presence of missing data. The underlying statistical model encapsulates the
low-dimensional structure of the incomplete data matrix and the sparsity of the
regression coefficients, and the proposed algorithm jointly learns the
low-dimensional structure of the data and a linear regressor with sparse
coefficients. The proposed stochastic optimization method, Sparse Linear
Regression with Missing Data (SLRM), performs an alternating minimization
procedure and scales well with the problem size. Large deviation inequalities
shed light on the impact of the various problem-dependent parameters on the
expected squared loss of the learned regressor. Extensive simulations on both
synthetic and real datasets show that SLRM performs better than competing
algorithms in a variety of contexts.
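A batch caricature of this joint approach (SLRM itself is a stochastic algorithm; the rank, ridge term, and use of scikit-learn's Lasso here are assumptions for illustration) alternates least-squares updates of low-rank factors on the observed entries with a Lasso fit on the completed matrix:

```python
import numpy as np
from sklearn.linear_model import Lasso

def alt_min_sparse_regression(X_obs, mask, y, rank=5, lam=0.1, outer=10):
    """Alternately fit a rank-r factorization X ~ U V' to the observed
    entries (boolean mask, n x p), then Lasso-regress y on the completed
    matrix. A batch sketch, not the stochastic SLRM procedure."""
    n, p = X_obs.shape
    rng = np.random.default_rng(0)
    U = rng.standard_normal((n, rank))
    V = rng.standard_normal((p, rank))
    for _ in range(outer):
        # ridge-regularized alternating least squares on observed entries
        for i in range(n):
            idx = mask[i]
            Vi = V[idx]
            U[i] = np.linalg.solve(Vi.T @ Vi + 1e-6 * np.eye(rank),
                                   Vi.T @ X_obs[i, idx])
        for j in range(p):
            idx = mask[:, j]
            Uj = U[idx]
            V[j] = np.linalg.solve(Uj.T @ Uj + 1e-6 * np.eye(rank),
                                   Uj.T @ X_obs[idx, j])
    X_hat = U @ V.T                       # completed design matrix
    reg = Lasso(alpha=lam).fit(X_hat, y)  # sparse regressor on completed data
    return X_hat, reg.coef_
```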
Minimax Optimal Rates for Poisson Inverse Problems with Physical Constraints
This paper considers fundamental limits for solving sparse inverse problems
in the presence of Poisson noise with physical constraints. Such problems arise
in a variety of applications, including photon-limited imaging systems based on
compressed sensing. Most prior theoretical results in compressed sensing and
related inverse problems apply to idealized settings where the noise is i.i.d.,
and do not account for signal-dependent noise and physical sensing constraints.
Prior results on Poisson compressed sensing with signal-dependent noise and
physical constraints provided upper bounds on mean squared error performance
for a specific class of estimators. However, it was unknown whether those
bounds were tight or if other estimators could achieve significantly better
performance. This work provides minimax lower bounds on mean-squared error for
sparse Poisson inverse problems under physical constraints. Our lower bounds
are complemented by minimax upper bounds. Our upper and lower bounds reveal
that due to the interplay between the Poisson noise model, the sparsity
constraint and the physical constraints: (i) the mean-squared error does not
depend on the sample size other than to ensure the sensing matrix satisfies
RIP-like conditions and the intensity of the input signal plays a critical
role; and (ii) the mean-squared error has two distinct regimes, a
low-intensity regime and a high-intensity regime, and the transition point
from the low-intensity to the high-intensity regime depends on the input
signal. In the low-intensity regime the mean-squared error is independent of
the signal intensity $I$, while in the high-intensity regime the mean-squared
error scales as $s\log(p/s)/I$, where $s$ is the sparsity level, $p$ is the
number of pixels or parameters, and $I$ is the signal intensity.
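Stated as a display (a paraphrase of the rates above with constants omitted; the exact normalization in the paper's theorems may differ):

```latex
\mathrm{MSE}(s, p, I) \;\asymp\;
\begin{cases}
  C & \text{low-intensity regime (independent of } I\text{)},\\[4pt]
  \dfrac{s \log(p/s)}{I} & \text{high-intensity regime}.
\end{cases}
```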
Relax but stay in control: from value to algorithms for online Markov decision processes
Online learning algorithms are designed to perform in non-stationary
environments, but generally there is no notion of a dynamic state to model
constraints on current and future actions as a function of past actions.
State-based models are common in stochastic control settings, but commonly used
frameworks such as Markov Decision Processes (MDPs) assume a known stationary
environment. In recent years, there has been a growing interest in combining
the above two frameworks and considering an MDP setting in which the cost
function is allowed to change arbitrarily after each time step. However, most
of the work in this area has been algorithmic: given a problem, one would
develop an algorithm almost from scratch. Moreover, the presence of the state
and the assumption of an arbitrarily varying environment complicate both the
theoretical analysis and the development of computationally efficient methods.
This paper describes a broad extension of the ideas proposed by Rakhlin et al.
to give a general framework for deriving algorithms in an MDP setting with
arbitrarily changing costs. This framework leads to a unifying view of existing
methods and provides a general procedure for constructing new ones. Several new
methods are presented, and one of them is shown to have important advantages
over a similar method developed from scratch via an online version of
approximate dynamic programming.
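For intuition only, here is a minimal baseline in this setting (a sketch under strong simplifying assumptions, not one of the paper's relaxation-based methods): exponential weights over a finite set of candidate stationary policies, scoring each by the cost it would have incurred on the arbitrarily varying cost sequence, and ignoring the state-mixing effects that the paper's analysis must handle.

```python
import numpy as np

def hedge_over_policies(policy_costs, eta=0.1):
    """Exponential weights over K candidate stationary policies for an MDP
    with arbitrarily varying costs: policy_costs[t, k] is the cost policy k
    incurs at round t. Returns the weight used at each round."""
    T, K = policy_costs.shape
    w = np.full(K, 1.0 / K)
    weights = np.empty((T, K))
    for t in range(T):
        weights[t] = w                            # weights used to act at t
        w = w * np.exp(-eta * policy_costs[t])    # multiplicative update
        w /= w.sum()
    return weights
```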
Estimating Network Structure from Incomplete Event Data
Multivariate Bernoulli autoregressive (BAR) processes model time series of
events in which the likelihood of current events is determined by the times and
locations of past events. These processes can be used to model nonlinear
dynamical systems corresponding to criminal activity, responses of patients to
different medical treatment plans, opinion dynamics across social networks,
epidemic spread, and more. Past work examines this problem under the assumption
that the event data is complete, but in many cases only a fraction of events
are observed. Incomplete observations pose a significant challenge in this
setting because the unobserved events still govern the underlying dynamical
system. In this work, we develop a novel approach to estimating the parameters
of a BAR process in the presence of unobserved events via an unbiased estimator
of the complete data log-likelihood function. We propose a computationally
efficient estimation algorithm which approximates this estimator via Taylor
series truncation and establish theoretical results for both the statistical
error and optimization error of our algorithm. We further justify our approach
by testing our method on both simulated data and a real data set consisting of
crimes recorded by the city of Chicago.
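To make the model class concrete (a complete-data baseline only; the paper's contribution is the unbiased, Taylor-truncated estimator for the partially observed case), here is a sketch that simulates a multivariate Bernoulli autoregressive process and recovers each node's parameters by l1-penalized logistic regression on the full event history:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def simulate_bar(W, b, T, rng):
    """Simulate X[t, i] ~ Bernoulli(sigmoid(b[i] + W[i] @ X[t-1]))."""
    p = len(b)
    X = np.zeros((T, p))
    for t in range(1, T):
        logits = b + W @ X[t - 1]
        X[t] = rng.random(p) < 1.0 / (1.0 + np.exp(-logits))
    return X

def fit_bar_complete_data(X, C=1.0):
    """Per-node L1-regularized logistic regression of X[t] on X[t-1].
    Assumes each node both fires and stays silent at least once."""
    T, p = X.shape
    W_hat = np.zeros((p, p))
    for i in range(p):
        clf = LogisticRegression(penalty="l1", solver="liblinear", C=C)
        clf.fit(X[:-1], X[1:, i])
        W_hat[i] = clf.coef_
    return W_hat
```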
Subspace Clustering with Missing and Corrupted Data
Given full or partial information about a collection of points that lie close
to a union of several subspaces, subspace clustering refers to the process of
clustering the points according to their subspace and identifying the
subspaces. One popular approach, sparse subspace clustering (SSC), represents
each sample as a weighted combination of the other samples, with weights of
minimal $\ell_1$ norm, and then uses those learned weights to cluster the
samples. SSC is stable in settings where each sample is contaminated by a
relatively small amount of noise. However, when there is a significant amount
of additive noise, or a considerable number of entries are missing, theoretical
guarantees are scarce. In this paper, we study a robust variant of SSC and
establish clustering guarantees in the presence of corrupted or missing data.
We give explicit bounds on the amount of noise and missing data that the algorithm
can tolerate, both in deterministic settings and in a random generative model.
Notably, our approach provides guarantees for higher tolerance to noise and
missing data than existing analyses for this method. By design, the results
hold even when we do not know the locations of the missing data; e.g., as in
presence-only data.
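As a reference point, a bare-bones version of the SSC pipeline this analysis concerns (the lambda choice and use of scikit-learn are illustrative assumptions, and the paper's robust variant differs in details; missing entries are assumed zero-filled beforehand):

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.cluster import SpectralClustering

def sparse_subspace_clustering(X, n_clusters, lam=0.05):
    """Each column of X (d x n) is regressed on the others with an l1
    penalty; the symmetrized coefficient magnitudes define an affinity
    graph that is then spectrally clustered."""
    d, n = X.shape
    C = np.zeros((n, n))
    for j in range(n):
        others = np.delete(np.arange(n), j)
        reg = Lasso(alpha=lam, max_iter=5000)
        reg.fit(X[:, others], X[:, j])
        C[others, j] = reg.coef_           # zero self-representation by design
    A = np.abs(C) + np.abs(C).T            # symmetric affinity matrix
    return SpectralClustering(n_clusters=n_clusters,
                              affinity="precomputed").fit_predict(A)
```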
Dynamical Models and Tracking Regret in Online Convex Programming
This paper describes a new online convex optimization method which
incorporates a family of candidate dynamical models and establishes novel
tracking regret bounds that scale with the comparator's deviation from the best
dynamical model in this family. Previous online optimization methods are
designed to have a total accumulated loss comparable to that of the best
comparator sequence, and existing tracking or shifting regret bounds scale with
the overall variation of the comparator sequence. In many practical scenarios,
however, the environment is nonstationary and comparator sequences with small
variation are quite weak, resulting in large losses. The proposed Dynamic
Mirror Descent method, in contrast, can yield low regret relative to highly
variable comparator sequences by both tracking the best dynamical model and
forming predictions based on that model. This concept is demonstrated
empirically in the context of sequential compressive observations of a dynamic
scene and tracking a dynamic social network.
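In its Euclidean instantiation, the update the abstract describes can be sketched in a few lines (a simplified rendering, assuming a gradient oracle for each round's loss and a single dynamical model Phi rather than the full family with model weights):

```python
import numpy as np

def dynamic_mirror_descent(grad_fn, Phi, theta0, eta, T):
    """Euclidean instance of dynamic mirror descent: take a gradient step
    on the round-t loss, then advance the iterate by the dynamical model
    Phi before the next round."""
    theta = np.asarray(theta0, dtype=float).copy()
    iterates = [theta.copy()]
    for t in range(T):
        theta = theta - eta * grad_fn(t, theta)   # mirror (gradient) step
        theta = Phi(t, theta)                     # prediction via dynamics
        iterates.append(theta.copy())
    return np.array(iterates)
```

A family of candidate models can then be handled by running one such copy per Phi and aggregating their predictions with exponential weights, which is roughly the flavor of the extensions the tracking regret bounds cover.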
Matrix Completion Under Monotonic Single Index Models
Most recent results in matrix completion assume that the matrix under
consideration is low-rank or that the columns are in a union of low-rank
subspaces. In real-world settings, however, the linear structure underlying
these models is distorted by a (typically unknown) nonlinear transformation.
This paper addresses the challenge of matrix completion in the face of such
nonlinearities. Given a few observations of a matrix that are obtained by
applying a Lipschitz, monotonic function to a low rank matrix, our task is to
estimate the remaining unobserved entries. We propose a novel matrix completion
method that alternates between low-rank matrix estimation and monotonic
function estimation to estimate the missing matrix elements. Mean squared error
bounds provide insight into how well the matrix can be estimated based on the
size and rank of the matrix and on properties of the nonlinear transformation.
Empirical results on synthetic and real-world datasets demonstrate the
competitiveness of the proposed approach.
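One way to picture the alternation (a loose caricature: the interpolation-based inversion and truncated SVD below stand in for the paper's estimators, and `rank` and `iters` are illustrative knobs):

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

def monotone_matrix_completion(Y, mask, rank=5, iters=10):
    """Caricature of alternating estimation for Y ~ g(Z) with Z low-rank
    and g monotone: fit g by isotonic regression, invert it on the
    observed entries, then re-fit Z by truncated SVD."""
    Z = np.where(mask, Y, 0.0)                      # crude initialization
    iso = IsotonicRegression(out_of_bounds="clip")
    for _ in range(iters):
        iso.fit(Z[mask], Y[mask])                   # (a) monotone link fit
        zs = np.sort(Z[mask])
        gs = iso.predict(zs)
        gu, idx = np.unique(gs, return_index=True)  # strictly increasing grid
        Z_new = Z.copy()
        Z_new[mask] = np.interp(Y[mask], gu, zs[idx])  # approximate g^{-1}(y)
        U, s, Vt = np.linalg.svd(Z_new, full_matrices=False)
        Z = (U[:, :rank] * s[:rank]) @ Vt[:rank]    # (b) low-rank refit
    return iso.predict(Z.ravel()).reshape(Z.shape)  # completed matrix g(Z)
```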
Online Markov decision processes with Kullback-Leibler control cost
This paper considers an online (real-time) control problem that involves an
agent performing a discrete-time random walk over a finite state space. The
agent's action at each time step is to specify the probability distribution for
the next state given the current state. Following the set-up of Todorov, the
state-action cost at each time step is a sum of a state cost and a control cost
given by the Kullback-Leibler (KL) divergence between the agent's next-state
distribution and that determined by some fixed passive dynamics. The online
aspect of the problem is due to the fact that the state cost functions are
generated by a dynamic environment, and the agent learns the current state cost
only after selecting an action. An explicit construction of a computationally
efficient strategy with small regret (i.e., expected difference between its
actual total cost and the smallest cost attainable using noncausal knowledge of
the state costs) under mild regularity conditions is presented, along with a
demonstration of the performance of the proposed strategy on a simulated target
tracking problem. A number of new results on Markov decision processes with KL
control cost are also obtained.
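For the fixed-cost version of Todorov's setup (offline, with a static state cost q; the online strategy in the paper handles costs revealed over time), the optimal controlled dynamics have a closed form via the principal eigenvector of diag(exp(-q)) P, which a few lines of power iteration recover:

```python
import numpy as np

def kl_control_policy(P, q, iters=200):
    """Linearly solvable MDP (Todorov): given row-stochastic passive
    dynamics P and state costs q, power-iterate the desirability vector
    z = diag(exp(-q)) P z, then return the optimal controlled kernel
    u(x'|x) proportional to P[x, x'] * z[x']."""
    z = np.ones(P.shape[0])
    for _ in range(iters):
        z = np.exp(-q) * (P @ z)
        z /= np.linalg.norm(z)       # z is the principal eigenvector
    U = P * z[None, :]
    return U / U.sum(axis=1, keepdims=True)
```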