Near-Optimal Algorithms for Differentially-Private Principal Components
Principal components analysis (PCA) is a standard tool for identifying good
low-dimensional approximations to data in high dimension. Many data sets of
interest contain private or sensitive information about individuals. Algorithms
which operate on such data should be sensitive to the privacy risks in
publishing their outputs. Differential privacy is a framework for developing
tradeoffs between privacy and the utility of these outputs. In this paper we
investigate the theory and empirical performance of differentially private
approximations to PCA and propose a new method which explicitly optimizes the
utility of the output. We show that the sample complexity of the proposed
method differs from the existing procedure in the scaling with the data
dimension, and that our method is nearly optimal in terms of this scaling. We
furthermore illustrate our results, showing that on real data there is a large
performance gap between the existing method and our method.
Comment: 37 pages, 8 figures; final version to appear in the Journal of
Machine Learning Research, preliminary version was at NIPS 201
The Poisson-Dirichlet law is the unique invariant distribution for uniform split-merge transformations
We consider a Markov chain on the space of (countable) partitions of the
interval [0,1], obtained first by size biased sampling twice (allowing
repetitions) and then merging the parts (if the sampled parts are distinct) or
splitting the part uniformly (if the same part was sampled twice). We prove a
conjecture of Vershik stating that the Poisson-Dirichlet law with parameter
theta=1 is the unique invariant distribution for this Markov chain.
Our proof uses a combination of probabilistic, combinatoric, and
representation-theoretic arguments.
Comment: To appear in Annals Probab. 6 figures. Only change in new version is
addition of proof (at end of article) that the state (1,0,0,...) is transient
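As an illustration of the transformation described in this abstract, here is a minimal sketch of one split-merge step. It assumes the partition is truncated to a finite list of positive part sizes summing to 1 (the paper works with countable partitions), and the function name `split_merge_step` is hypothetical, not taken from the paper.

```python
import random


def split_merge_step(parts, rng=random):
    """Apply one uniform split-merge step to a finite partition of [0, 1].

    `parts` is a list of positive part sizes summing to 1 (an assumed
    finite truncation of a countable partition).
    """
    # Size-biased sampling with repetition: each index is drawn with
    # probability equal to the size of its part.
    i, j = rng.choices(range(len(parts)), weights=parts, k=2)
    if i != j:
        # Two distinct parts were sampled: merge them into one part.
        rest = [p for k, p in enumerate(parts) if k not in (i, j)]
        return rest + [parts[i] + parts[j]]
    # The same part was sampled twice: split it at a uniform point.
    u = rng.random()
    rest = [p for k, p in enumerate(parts) if k != i]
    return rest + [u * parts[i], (1.0 - u) * parts[i]]
```

Each step changes the number of parts by exactly one and preserves the total mass, so iterating the function from any starting list gives a simple way to simulate the chain.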
Stochastic boundary conditions for molecular dynamics simulations
In this paper we develop stochastic boundary conditions (SBC) for
event-driven molecular dynamics simulations of a finite volume embedded within
an infinite environment. In this method, we first collect the statistics of
injection/ejection events in periodic boundary conditions (PBC). Once
sufficient statistics are collected, we remove the PBC and turn on the SBC. In
the SBC simulations, we allow particles leaving the system to be truly ejected
from the simulation, and randomly inject particles at the boundaries by
resampling from the injection/ejection statistics collected from the current or
previous simulations. With the SBC, we can measure thermodynamic quantities
within the grand canonical ensemble, based on the particle number and energy
fluctuations. To demonstrate how useful the SBC algorithm is, we simulated a
hard disk gas and measured the pair distribution function, the compressibility
and the specific heat, comparing them against literature values.
Comment: 24 pages, 16 figures
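For orientation, the textbook grand canonical relation between particle-number fluctuations and the isothermal compressibility is reproduced below. This is the standard statistical-mechanics result, not a formula quoted from the paper; the symbols $V$ (system volume, or area for a two-dimensional hard disk gas), $T$ (temperature), $k_B$ (Boltzmann constant), and $\kappa_T$ (compressibility) are assumed notation.

```latex
\[
  \langle N^2 \rangle - \langle N \rangle^2
    = \frac{\langle N \rangle^2 \, k_B T \, \kappa_T}{V}
  \quad\Longrightarrow\quad
  \kappa_T
    = \frac{V \left( \langle N^2 \rangle - \langle N \rangle^2 \right)}
           {k_B T \, \langle N \rangle^2}
\]
```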
Distributed $(\Delta+1)$-Coloring in Sublogarithmic Rounds
We give a new randomized distributed algorithm for $(\Delta+1)$-coloring in
the LOCAL model, running in $O(\sqrt{\log \Delta}) + 2^{O(\sqrt{\log \log n})}$
rounds in a graph of maximum degree $\Delta$. This implies that the
$(\Delta+1)$-coloring problem is easier than the maximal independent set
problem and the maximal matching problem, due to their lower bounds of
$\Omega\left(\min\left(\sqrt{\frac{\log n}{\log \log n}}, \frac{\log \Delta}{\log \log \Delta}\right)\right)$
by Kuhn, Moscibroda, and Wattenhofer [PODC'04].
Our algorithm also extends to list-coloring where the palette of each node
contains $\Delta+1$ colors. We extend the set of distributed symmetry-breaking
techniques by performing a decomposition of graphs into dense and sparse parts.
Swing Dynamics as Primal-Dual Algorithm for Optimal Load Control
Frequency regulation and generation-load balancing are key issues in power transmission networks. Complementary to generation control, loads provide flexible and fast-responding sources for frequency regulation, and the local frequency measurement capability of loads offers the opportunity for decentralized control. In this paper, we propose an optimal load control problem, which balances the load reduction (or increase) with the generation shortfall (or surplus), resynchronizes the bus frequencies, and minimizes a measure of aggregate disutility of participation in such a load control. We find that frequency-based load control, coupled with the dynamics of swing equations and branch power flows, serves as a distributed primal-dual algorithm to solve the optimal load control problem and its dual. Simulation shows that the proposed mechanism can restore frequency, balance load with generation, and achieve the optimum of the load control problem within several seconds after a disturbance in generation. Through simulation, we also compare the performance of optimal load control with automatic generation control (AGC), and discuss the effect of their incorporation.
An exact expression to calculate the derivatives of position-dependent observables in molecular simulations with flexible constraints
In this work, we introduce an algorithm to compute the derivatives of
physical observables along the constrained subspace when flexible constraints
are imposed on the system (i.e., constraints in which the hard coordinates are
fixed to configuration-dependent values). The presented scheme is exact, it
does not contain any tunable parameter, and it only requires the calculation
and inversion of a sub-block of the Hessian matrix of second derivatives of the
function through which the constraints are defined. We also present a practical
application to the case in which the sought observables are the Euclidean
coordinates of complex molecular systems, and the function whose minimization
defines the constraints is the potential energy. Finally, and in order to
validate the method, which, as far as we are aware, is the first of its kind in
the literature, we compare it to the natural and straightforward
finite-differences approach in three molecules of biological relevance:
methanol, N-methyl-acetamide and a tri-glycine peptide.
Comment: 13 pages, 8 figures, published version
Relative fixed-width stopping rules for Markov chain Monte Carlo simulations
Markov chain Monte Carlo (MCMC) simulations are commonly employed for
estimating features of a target distribution, particularly for Bayesian
inference. A fundamental challenge is determining when these simulations should
stop. We consider a sequential stopping rule that terminates the simulation
when the width of a confidence interval is sufficiently small relative to the
size of the target parameter. Specifically, we propose relative magnitude and
relative standard deviation stopping rules in the context of MCMC. In each
setting, we develop sufficient conditions for asymptotic validity, that is,
conditions to ensure the simulation will terminate with probability one and the
resulting confidence intervals will have the proper coverage probability. Our
results are applicable in a wide variety of MCMC estimation settings, such as
expectation, quantile, or simultaneous multivariate estimation. Finally, we
investigate the finite sample properties through a variety of examples and
provide some recommendations to practitioners.
Comment: 24 pages
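The relative magnitude idea in this abstract can be sketched roughly as follows: keep simulating until the confidence-interval half-width falls below a fraction `eps` of the absolute value of the estimate. Everything here is an illustrative assumption rather than the paper's procedure: the batch-means standard error, the normal quantile `z=1.96`, the checking schedule, and the function names are all hypothetical choices, and the usage example draws i.i.d. values as a stand-in for a real Markov chain.

```python
import math
import random


def batch_means_se(samples, n_batches=20):
    """Batch-means standard error of the sample mean (an assumed estimator)."""
    b = len(samples) // n_batches  # batch length
    means = [sum(samples[k * b:(k + 1) * b]) / b for k in range(n_batches)]
    grand = sum(means) / n_batches
    var_between = sum((m - grand) ** 2 for m in means) / (n_batches - 1)
    return math.sqrt(var_between / n_batches)


def run_until_relative_width(draw, eps=0.05, z=1.96,
                             check_every=1000, max_n=100_000):
    """Draw samples until half-width <= eps * |estimate| or max_n is reached.

    `draw` is a zero-argument callable returning the next value of the chain.
    """
    samples = []
    while True:
        samples.extend(draw() for _ in range(check_every))
        n = len(samples)
        estimate = sum(samples) / n
        half_width = z * batch_means_se(samples)
        # Relative magnitude check, with max_n as a safety cap.
        if half_width <= eps * abs(estimate) or n >= max_n:
            return estimate, half_width, n


# Usage: an i.i.d. Gaussian sampler stands in for an MCMC chain.
rng = random.Random(1)
estimate, half_width, n = run_until_relative_width(lambda: rng.gauss(10.0, 1.0))
```

Checking only every `check_every` draws keeps the rule from firing on a very short run, where the standard-error estimate is unreliable.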