The Noisy Power Method: A Meta Algorithm with Applications
We provide a new robust convergence analysis of the well-known power method
for computing the dominant singular vectors of a matrix that we call the noisy
power method. Our result characterizes the convergence behavior of the
algorithm when a significant amount of noise is introduced after each
matrix-vector multiplication. The noisy power method can be seen as a
meta-algorithm that has recently found a number of important applications in a
broad range of machine learning problems including alternating minimization for
matrix completion, streaming principal component analysis (PCA), and
privacy-preserving spectral analysis. Our general analysis subsumes several
existing ad-hoc convergence bounds and resolves a number of open problems in
multiple applications including streaming PCA and privacy-preserving singular
vector computation.
Comment: NIPS 201
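The iteration described in this abstract (multiply, add noise, renormalize) can be sketched for a single vector as follows. This is a minimal illustration, not the paper's algorithm or analysis: it assumes a small symmetric positive semidefinite matrix (so the dominant eigenvector coincides with the dominant singular vector), i.i.d. Gaussian noise, and the hypothetical names `noisy_power_method` and `noise_std`.

```python
import math
import random


def noisy_power_method(A, iters=100, noise_std=1e-3, seed=0):
    """Single-vector power iteration with noise added after each multiply.

    A is a square symmetric matrix given as a list of lists. The Gaussian
    noise model and all parameter names are assumptions for illustration.
    """
    rng = random.Random(seed)
    n = len(A)
    # Random unit-norm starting vector.
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(v * v for v in x))
    x = [v / norm for v in x]
    for _ in range(iters):
        # Matrix-vector product y = A x.
        y = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        # Perturb the product with i.i.d. Gaussian noise.
        y = [v + rng.gauss(0.0, noise_std) for v in y]
        # Renormalize to unit length.
        norm = math.sqrt(sum(v * v for v in y))
        x = [v / norm for v in y]
    return x


A = [[4.0, 1.0], [1.0, 3.0]]
v = noisy_power_method(A)
# Rayleigh quotient v^T A v estimates the dominant eigenvalue.
rayleigh = sum(v[i] * sum(A[i][j] * v[j] for j in range(2)) for i in range(2))
```

For this 2x2 example the dominant eigenvalue is (7 + sqrt(5)) / 2, and with small noise the Rayleigh quotient of the returned vector should land close to it.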
Private Incremental Regression
Data is continuously generated by modern data sources, and a recent challenge
in machine learning has been to develop techniques that perform well in an
incremental (streaming) setting. In this paper, we investigate the problem of
private machine learning, where, as is common in practice, the data is not given at
once, but rather arrives incrementally over time.
We introduce the problems of private incremental ERM and private incremental
regression where the general goal is to always maintain a good empirical risk
minimizer for the history observed under differential privacy. Our first
contribution is a generic transformation of private batch ERM mechanisms into
private incremental ERM mechanisms, based on a simple idea of invoking the
private batch ERM procedure at some regular time intervals. We take this
construction as a baseline for comparison. We then provide two mechanisms for
the private incremental regression problem. Our first mechanism is based on
privately constructing a noisy incremental gradient function, which is then
used in a modified projected gradient procedure at every timestep. This
mechanism has an excess empirical risk of \approx \sqrt{d}, where d is the
dimensionality of the data. While from the results of [Bassily et al. 2014]
this bound is tight in the worst-case, we show that certain geometric
properties of the input and constraint set can be used to derive significantly
better results for certain interesting regression problems.
Comment: To appear in PODS 201
MVG Mechanism: Differential Privacy under Matrix-Valued Query
Differential privacy mechanism design has traditionally been tailored for a
scalar-valued query function. Although many mechanisms such as the Laplace and
Gaussian mechanisms can be extended to a matrix-valued query function by adding
i.i.d. noise to each element of the matrix, this method is often suboptimal as
it forfeits an opportunity to exploit the structural characteristics typically
associated with matrix analysis. To address this challenge, we propose a novel
differential privacy mechanism called the Matrix-Variate Gaussian (MVG)
mechanism, which adds a matrix-valued noise drawn from a matrix-variate
Gaussian distribution, and we rigorously prove that the MVG mechanism preserves
(\epsilon, \delta)-differential privacy. Furthermore, we introduce the concept
of directional noise made possible by the design of the MVG mechanism.
Directional noise allows the impact of the noise on the utility of the
matrix-valued query function to be moderated. Finally, we experimentally
demonstrate the performance of our mechanism using three matrix-valued queries
on three privacy-sensitive datasets. We find that the MVG mechanism notably
outperforms four previous state-of-the-art approaches, and provides comparable
utility to the non-private baseline.
Comment: Appeared in CCS'1
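The baseline this abstract contrasts against, adding i.i.d. noise to every entry of a matrix-valued query, can be sketched as below. This is not the MVG mechanism itself. The sketch uses the classical Gaussian-mechanism calibration sigma = sqrt(2 ln(1.25/delta)) * sensitivity / epsilon, which applies for 0 < epsilon < 1; the function name `gaussian_mechanism_matrix` and the assumption that the caller supplies the query's L2 (Frobenius) sensitivity are illustrative choices.

```python
import math
import random


def gaussian_mechanism_matrix(M, l2_sensitivity, epsilon, delta, seed=None):
    """Add i.i.d. Gaussian noise to each entry of the matrix M.

    Uses the classical calibration, which requires 0 < epsilon < 1.
    l2_sensitivity is the Frobenius-norm sensitivity of the query and is
    assumed to be provided by the caller.
    """
    if not (0.0 < epsilon < 1.0) or not (0.0 < delta < 1.0):
        raise ValueError("classical calibration needs 0 < epsilon, delta < 1")
    rng = random.Random(seed)
    sigma = math.sqrt(2.0 * math.log(1.25 / delta)) * l2_sensitivity / epsilon
    noisy = [[entry + rng.gauss(0.0, sigma) for entry in row] for row in M]
    return noisy, sigma


query_output = [[1.0, 2.0], [3.0, 4.0]]
noisy_output, sigma = gaussian_mechanism_matrix(
    query_output, l2_sensitivity=1.0, epsilon=0.5, delta=1e-5, seed=0
)
```

Because every entry receives noise of the same scale regardless of the matrix's structure, this is the kind of baseline the abstract describes as often suboptimal.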
An Adaptive Mechanism for Accurate Query Answering under Differential Privacy
We propose a novel mechanism for answering sets of counting queries under
differential privacy. Given a workload of counting queries, the mechanism
automatically selects a different set of "strategy" queries to answer
privately, using those answers to derive answers to the workload. The main
algorithm proposed in this paper approximates the optimal strategy for any
workload of linear counting queries. With no cost to the privacy guarantee, the
mechanism improves significantly on prior approaches and achieves near-optimal
error for many workloads, when applied under (\epsilon, \delta)-differential
privacy. The result is an adaptive mechanism which can help users achieve good
utility without requiring that they reason carefully about the best formulation
of their task.
Comment: VLDB2012. arXiv admin note: substantial text overlap with arXiv:1103.136
Wishart Mechanism for Differentially Private Principal Components Analysis
We propose a new input perturbation mechanism for publishing a covariance
matrix to achieve (\epsilon, 0)-differential privacy. Our mechanism uses a
Wishart distribution to generate matrix noise. In particular, we apply this
mechanism to principal component analysis. Our mechanism is able to keep the
positive semi-definiteness of the published covariance matrix. Thus, our
approach gives rise to a general publishing framework for input perturbation of
a symmetric positive semidefinite matrix. Moreover, compared with the classic
Laplace mechanism, our method has a better utility guarantee. To the best of our
knowledge, the Wishart mechanism is the best input perturbation approach for
(\epsilon, 0)-differentially private PCA. We also compare our work with
previous exponential mechanism algorithms in the literature and provide a
near-optimal bound while being more flexible and more computationally
tractable.
Comment: A full version with technical proofs. Accepted to AAAI-1
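Why Wishart-distributed noise keeps a published covariance matrix positive semidefinite can be seen in a small sketch. A Wishart sample with identity-proportional scale can be written as scale * Z Z^T for a Gaussian matrix Z, which is itself positive semidefinite, so adding it to a covariance matrix cannot introduce negative eigenvalues. The names `sample_wishart_noise`, `dof`, and `scale` are hypothetical, and the value of `scale` here is a placeholder that is not calibrated to any privacy budget; the paper's actual parameters are not reproduced.

```python
import random


def sample_wishart_noise(d, dof, scale, seed=None):
    """Draw W = scale * Z Z^T, where Z is d x dof with standard normal entries.

    W follows a Wishart distribution with dof degrees of freedom and scale
    matrix scale * I. The scale is NOT privacy-calibrated in this sketch.
    """
    rng = random.Random(seed)
    Z = [[rng.gauss(0.0, 1.0) for _ in range(dof)] for _ in range(d)]
    return [
        [scale * sum(Z[i][k] * Z[j][k] for k in range(dof)) for j in range(d)]
        for i in range(d)
    ]


def add_matrices(A, B):
    """Entrywise sum of two equally sized matrices."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


cov = [[2.0, 0.5], [0.5, 1.0]]            # toy covariance matrix
W = sample_wishart_noise(d=2, dof=3, scale=0.1, seed=0)
published = add_matrices(cov, W)

# Quadratic form x^T W x is non-negative for any x, here x = (1, -1).
x = [1.0, -1.0]
quad = sum(x[i] * W[i][j] * x[j] for i in range(2) for j in range(2))
```

By contrast, adding independent symmetric Laplace or Gaussian noise to each entry can push the perturbed matrix outside the positive semidefinite cone.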
The ECMWF Ensemble Prediction System: Looking Back (more than) 25 Years and Projecting Forward 25 Years
This paper has been written to mark 25 years of operational medium-range
ensemble forecasting. The origins of the ECMWF Ensemble Prediction System are
outlined, including the development of the precursor real-time Met Office
monthly ensemble forecast system. In particular, the reasons for the
development of singular vectors and stochastic physics - particular features of
the ECMWF Ensemble Prediction System - are discussed. The author speculates
about the development and use of ensemble prediction in the next 25 years.
Comment: Submitted to Special Issue of the Quarterly Journal of the Royal
Meteorological Society: 25 years of ensemble predictio
Near-Optimal Algorithms for Differentially-Private Principal Components
Principal components analysis (PCA) is a standard tool for identifying good
low-dimensional approximations to data in high dimension. Many data sets of
interest contain private or sensitive information about individuals. Algorithms
which operate on such data should be sensitive to the privacy risks in
publishing their outputs. Differential privacy is a framework for developing
tradeoffs between privacy and the utility of these outputs. In this paper we
investigate the theory and empirical performance of differentially private
approximations to PCA and propose a new method which explicitly optimizes the
utility of the output. We show that the sample complexity of the proposed
method differs from the existing procedure in the scaling with the data
dimension, and that our method is nearly optimal in terms of this scaling. We
furthermore illustrate our results, showing that on real data there is a large
performance gap between the existing method and our method.
Comment: 37 pages, 8 figures; final version to appear in the Journal of
Machine Learning Research, preliminary version was at NIPS 201