High Dimensional Low Rank plus Sparse Matrix Decomposition
This paper is concerned with the problem of low rank plus sparse matrix
decomposition for big data. Conventional algorithms for matrix decomposition
use the entire data to extract the low-rank and sparse components, and are
based on optimization problems with complexity that scales with the dimension
of the data, which limits their scalability. Furthermore, existing randomized
approaches mostly rely on uniform random sampling, which is quite inefficient
for many real world data matrices that exhibit additional structures (e.g.
clustering). In this paper, a scalable subspace-pursuit approach that
transforms the decomposition problem to a subspace learning problem is
proposed. The decomposition is carried out using a small data sketch formed
from sampled columns/rows. Even when the data is sampled uniformly at random,
it is shown that the sufficient number of sampled columns/rows is roughly
O(r\mu), where \mu is the coherency parameter and r the rank of the low rank
component. In addition, adaptive sampling algorithms are proposed to address
the problem of column/row sampling from structured data. We provide an analysis
of the proposed method with adaptive sampling and show that adaptive sampling
makes the required number of sampled columns/rows invariant to the distribution
of the data. The proposed approach is amenable to online implementation and an
online scheme is proposed. Comment: IEEE Transactions on Signal Processing
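
As a rough illustration of the column-sampling idea in the abstract above (not the paper's algorithm, and ignoring the sparse component entirely), the following sketch builds a hypothetical rank-2 matrix and checks that a handful of uniformly sampled columns already spans its column space. The matrix construction, the helper names, and the sample size are all assumptions chosen for the example.

```python
import random
from fractions import Fraction


def matrix_rank(rows):
    """Rank via Gaussian elimination over exact fractions."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        # Find a row at or below the current pivot row with a nonzero entry.
        pivot = next((i for i in range(rank, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for i in range(rank + 1, len(m)):
            factor = m[i][col] / m[rank][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[rank])]
        rank += 1
    return rank


def low_rank_matrix(n_rows, n_cols):
    """Hypothetical rank-2 matrix L = U V^T with U[i] = (1, i*i), V[j] = (1, j)."""
    return [[1 + (i * i) * j for j in range(n_cols)] for i in range(n_rows)]


def sample_columns(matrix, k, rng):
    """Keep k distinct columns chosen uniformly at random."""
    cols = rng.sample(range(len(matrix[0])), k)
    return [[row[c] for c in cols] for row in matrix]


rng = random.Random(0)
L = low_rank_matrix(8, 40)
sketch = sample_columns(L, 5, rng)
full_rank = matrix_rank(L)
sketch_rank = matrix_rank(sketch)
```

In this toy matrix any two distinct columns are linearly independent by construction, so the 5-column sketch always has the same rank as the full matrix. With clustered data a uniform sample can miss directions, which is the situation the adaptive sampling in the abstract is meant to address.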
Differentially Private Model Selection with Penalized and Constrained Likelihood
In statistical disclosure control, the goal of data analysis is twofold: The
released information must provide accurate and useful statistics about the
underlying population of interest, while minimizing the potential for an
individual record to be identified. In recent years, the notion of differential
privacy has received much attention in theoretical computer science, machine
learning, and statistics. It provides a rigorous and strong notion of
protection for individuals' sensitive information. A fundamental question is
how to incorporate differential privacy into traditional statistical inference
procedures. In this paper we study model selection in multivariate linear
regression under the constraint of differential privacy. We show that model
selection procedures based on penalized least squares or likelihood can be made
differentially private by a combination of regularization and randomization,
and propose two algorithms to do so. We show that our private procedures are
consistent under essentially the same conditions as the corresponding
non-private procedures. We also find that under differential privacy, the
procedure becomes more sensitive to the tuning parameters. We illustrate and
evaluate our method using simulation studies and two real data examples.
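
The "randomization" ingredient mentioned above can be illustrated generically by perturbing each candidate model's penalized score with Laplace noise before taking the minimizer. This is a minimal sketch of that general pattern, not either of the paper's two algorithms. The function names, the example scores, and the noise scale `2 * sensitivity / epsilon` are assumptions, and any actual privacy guarantee depends on a correctly bounded sensitivity.

```python
import random


def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as a scaled difference of two Exp(1) draws."""
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def noisy_model_selection(penalized_scores, sensitivity, epsilon, rng):
    """Return the model whose noise-perturbed penalized score is smallest."""
    scale = 2.0 * sensitivity / epsilon  # assumed noise calibration
    noisy = {
        name: score + laplace_noise(scale, rng)
        for name, score in penalized_scores.items()
    }
    return min(noisy, key=noisy.get)


rng = random.Random(0)
# Hypothetical penalized least-squares scores for three candidate models.
scores = {"model_a": 10.0, "model_b": 3.0, "model_c": 7.5}
almost_exact = noisy_model_selection(scores, sensitivity=1.0, epsilon=1e9, rng=rng)
private_pick = noisy_model_selection(scores, sensitivity=1.0, epsilon=0.5, rng=rng)
```

With a very large epsilon the noise is negligible and the selection matches the non-private minimizer. A smaller epsilon makes the choice noisier, so it may differ from run to run.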
Optimal No-regret Learning in Repeated First-price Auctions
We study online learning in repeated first-price auctions with censored
feedback, where a bidder, only observing the winning bid at the end of each
auction, learns to adaptively bid in order to maximize her cumulative payoff.
To achieve this goal, the bidder faces a challenging dilemma: if she wins the
bid--the only way to achieve positive payoffs--then she is not able to observe
the highest bid of the other bidders, which we assume is iid drawn from an
unknown distribution. This dilemma, despite being reminiscent of the
exploration-exploitation trade-off in contextual bandits, cannot directly be
addressed by the existing UCB or Thompson sampling algorithms in that
literature, mainly because contrary to the standard bandits setting, when a
positive reward is obtained here, nothing about the environment can be learned.
In this paper, by exploiting the structural properties of first-price
auctions, we develop the first learning algorithm that achieves
regret bound when the bidder's private values are
stochastically generated. We do so by providing an algorithm on a general class
of problems, which we call monotone group contextual bandits, where the same
regret bound is established under stochastically generated contexts. Further,
by a novel lower bound argument, we characterize a lower
bound for the case where the contexts are adversarially generated, thus
highlighting the impact of the contexts generation mechanism on the fundamental
learning limit. Despite this, we further exploit the structure of first-price
auctions and develop a learning algorithm that operates sample-efficiently (and
computationally efficiently) in the presence of adversarially generated private
values. We establish a regret bound for this algorithm,
hence providing a complete characterization of optimal learning guarantees for
this problem.
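
The censored feedback described in the abstract above can be made concrete with a small sketch of a single auction round. It assumes that ties go to the learner and uses hypothetical names. It models only the feedback structure, not the learning algorithm.

```python
def auction_round(value, bid, others_highest_bid):
    """Simulate one first-price auction from the learner's point of view.

    Returns (payoff, winning_bid, others_bid_revealed).
    """
    if bid >= others_highest_bid:  # assumption: ties go to the learner
        # The learner wins and pays her own bid. The observed winning bid is
        # her own, so she only learns that the others bid at most `bid`.
        return value - bid, bid, False
    # The learner loses and earns nothing, but the observed winning bid is
    # exactly the highest bid of the other bidders.
    return 0.0, others_highest_bid, True


win = auction_round(value=1.0, bid=0.5, others_highest_bid=0.25)
loss = auction_round(value=1.0, bid=0.5, others_highest_bid=0.75)
```

The winning round has a positive payoff but hides the competing bid, while the losing round reveals it at zero payoff. This is the dilemma the abstract describes.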