Near-Optimal Algorithms for Differentially-Private Principal Components

Author: Chaudhuri Kamalika
Sarwate Anand D.
Sinha Kaushik
Publication date: 07/08/2013

Principal components analysis (PCA) is a standard tool for identifying good low-dimensional approximations to high-dimensional data. Many data sets of interest contain private or sensitive information about individuals. Algorithms which operate on such data should be sensitive to the privacy risks in publishing their outputs. Differential privacy is a framework for developing tradeoffs between privacy and the utility of these outputs. In this paper we investigate the theory and empirical performance of differentially private approximations to PCA and propose a new method which explicitly optimizes the utility of the output. We show that the sample complexity of the proposed method differs from the existing procedure in the scaling with the data dimension, and that our method is nearly optimal in terms of this scaling. We furthermore illustrate our results, showing that on real data there is a large performance gap between the existing method and our method. Comment: 37 pages, 8 figures; final version to appear in the Journal of Machine Learning Research, preliminary version was at NIPS 201

arXiv.org e-Print Archive

CiteSeerX

Shocker Open Access Repository

The Poisson-Dirichlet law is the unique invariant distribution for uniform split-merge transformations

Author: Diaconis Persi
Mayer-Wolf Eddy
Zeitouni Ofer
Zerner Martin
Publication date: 01/01/2003

We consider a Markov chain on the space of (countable) partitions of the interval [0,1], obtained first by size-biased sampling twice (allowing repetitions) and then merging the parts (if the sampled parts are distinct) or splitting the part uniformly (if the same part was sampled twice). We prove a conjecture of Vershik stating that the Poisson-Dirichlet law with parameter theta=1 is the unique invariant distribution for this Markov chain. Our proof uses a combination of probabilistic, combinatoric, and representation-theoretic arguments. Comment: To appear in Annals Probab. 6 figures. Only change in new version is addition of proof (at end of article) that the state (1,0,0,...) is transien
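The split-merge transition described in this abstract can be sketched in a few lines. This is only an illustrative simulation on a finite list of part lengths, not code from the paper; the function name `split_merge_step` and the choice to start from the single part `[1.0]` are assumptions made for the example, and "splitting uniformly" is read here as cutting the part at a uniformly distributed point.

```python
import random

def split_merge_step(parts, rng):
    """One step of the split-merge chain on a finite partition of [0, 1].

    `parts` is a list of non-negative part lengths summing to 1.
    """
    # Size-biased sampling twice, with repetition: each draw picks a part
    # with probability equal to its length.
    i, j = rng.choices(range(len(parts)), weights=parts, k=2)
    if i != j:
        # Two distinct parts were sampled: merge them into one part.
        merged = parts[i] + parts[j]
        rest = [p for k, p in enumerate(parts) if k not in (i, j)]
        return rest + [merged]
    # The same part was sampled twice: split it at a uniform point.
    u = rng.random()
    rest = parts[:i] + parts[i + 1:]
    return rest + [u * parts[i], (1.0 - u) * parts[i]]

rng = random.Random(0)
parts = [1.0]
for _ in range(1000):
    parts = split_merge_step(parts, rng)

# Sorted part sizes after many steps; under the result stated in the
# abstract, these should be approximately Poisson-Dirichlet(1) distributed.
largest_first = sorted(parts, reverse=True)
```

Every step preserves the total length, so the list always remains a partition of the unit interval up to floating-point error.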

arXiv.org e-Print Archive

CiteSeerX

Stochastic boundary conditions for molecular dynamics simulations

Author: Cheong Siew Ann
Chong Shan Shang
Leaw Jianing
Prusty Manamohan
Publication venue: 'Elsevier BV'
Publication date: 01/01/2011

In this paper we develop stochastic boundary conditions (SBC) for event-driven molecular dynamics simulations of a finite volume embedded within an infinite environment. In this method, we first collect the statistics of injection/ejection events in periodic boundary conditions (PBC). Once sufficient statistics are collected, we remove the PBC and turn on the SBC. In the SBC simulations, we allow particles leaving the system to be truly ejected from the simulation, and randomly inject particles at the boundaries by resampling from the injection/ejection statistics collected from the current or previous simulations. With the SBC, we can measure thermodynamic quantities within the grand canonical ensemble, based on the particle number and energy fluctuations. To demonstrate how useful the SBC algorithm is, we simulated a hard disk gas and measured the pair distribution function, the compressibility and the specific heat, comparing them against literature values. Comment: 24 pages, 16 figure

arXiv.org e-Print Archive

DR-NTU (Digital Repository of NTU)

Distributed $(\Delta+1)$-Coloring in Sublogarithmic Rounds

Author: Harris David G.
Schneider Johannes
Su Hsin-Hao
Publication date: 17/01/2018

We give a new randomized distributed algorithm for $(\Delta+1)$-coloring in the LOCAL model, running in $O(\sqrt{\log \Delta}) + 2^{O(\sqrt{\log \log n})}$ rounds in a graph of maximum degree $\Delta$. This implies that the $(\Delta+1)$-coloring problem is easier than the maximal independent set problem and the maximal matching problem, due to their lower bounds of $\Omega\left(\min\left(\sqrt{\frac{\log n}{\log \log n}}, \frac{\log \Delta}{\log \log \Delta}\right)\right)$ by Kuhn, Moscibroda, and Wattenhofer [PODC'04]. Our algorithm also extends to list-coloring where the palette of each node contains $\Delta+1$ colors. We extend the set of distributed symmetry-breaking techniques by performing a decomposition of graphs into dense and sparse parts.
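For orientation, the $(\Delta+1)$-coloring problem itself can be illustrated with the classic random color trial, simulated here in synchronous rounds. This is explicitly not the algorithm of the paper (which relies on a dense/sparse decomposition and achieves the sublogarithmic bound above); it is a much simpler baseline, and the function name `random_trial_coloring` and the adjacency-dictionary input format are assumptions made for the sketch.

```python
import random

def random_trial_coloring(adj, seed=0):
    """Simulate a simple randomized (Delta+1)-coloring in synchronous rounds.

    `adj` maps each node to a list of its neighbors (undirected graph).
    Returns (coloring, number_of_rounds).
    """
    rng = random.Random(seed)
    delta = max((len(nbrs) for nbrs in adj.values()), default=0)
    color = {}
    rounds = 0
    while len(color) < len(adj):
        rounds += 1
        # Each uncolored node proposes a random color from the palette
        # {0, ..., delta} that no already-colored neighbor is using.
        # At most deg(v) <= delta colors are taken, so one is always free.
        proposal = {}
        for v in adj:
            if v not in color:
                taken = {color[u] for u in adj[v] if u in color}
                free = [c for c in range(delta + 1) if c not in taken]
                proposal[v] = rng.choice(free)
        # A node keeps its proposal only if no neighbor proposed the same
        # color in this round; otherwise it retries in the next round.
        for v, c in proposal.items():
            if all(proposal.get(u) != c for u in adj[v]):
                color[v] = c
    return color, rounds

# Usage: a ring of 6 nodes has maximum degree 2, so 3 colors suffice.
ring = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
coloring, rounds = random_trial_coloring(ring)
```

Each node only reads its neighbors' state from the previous step, which is what makes the loop a faithful round-by-round simulation of a LOCAL-style procedure.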

arXiv.org e-Print Archive

Swing Dynamics as Primal-Dual Algorithm for Optimal Load Control

Author: Low Steven H.
Topcu Ufuk
Zhao Changhong
Publication date: 01/01/2012

Frequency regulation and generation-load balancing are key issues in power transmission networks. Complementary to generation control, loads provide flexible and fast responsive sources for frequency regulation, and local frequency measurement capability of loads offers the opportunity of decentralized control. In this paper, we propose an optimal load control problem, which balances the load reduction (or increase) with the generation shortfall (or surplus), resynchronizes the bus frequencies, and minimizes a measure of aggregate disutility of participation in such a load control. We find that frequency-based load control, coupled with the dynamics of swing equations and branch power flows, serves as a distributed primal-dual algorithm to solve the optimal load control problem and its dual. Simulation shows that the proposed mechanism can restore frequency, balance load with generation and achieve the optimum of the load control problem within several seconds after a disturbance in generation. Through simulation, we also compare the performance of optimal load control with automatic generation control (AGC), and discuss the effect of their incorporation.

CiteSeerX

Crossref

Caltech Authors

An exact expression to calculate the derivatives of position-dependent observables in molecular simulations with flexible constraints

Author: B Hess
BA Dubrovin
BR Brooks
CJ Cotter
Claudio N. Cavasotto
D Frenkel
DA Case
DA Pearlman
Darren R. Flower
DC Rapaport
E Helfand
EA Carter
EW Weisstein
F Jensen
H Goldstein
J Hutter
J Zhou
J.L. Alonso
JL Alonso
JP Ryckaert
JW Eastwood
JW Ponder
L Greengard
M Christen
MJ Frisch
Monica De Marco
MP Allen
N Gō
NG Van Kampen
P Echenique
P Kollman
P Pechukas
Pablo Echenique
Pablo García-Risueño
R Car
T Schlick
TA Darden
WD Cornell
WH Press
WL Jorgensen
X Andrade
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 12/09/2011

In this work, we introduce an algorithm to compute the derivatives of physical observables along the constrained subspace when flexible constraints are imposed on the system (i.e., constraints in which the hard coordinates are fixed to configuration-dependent values). The presented scheme is exact: it contains no tunable parameter, and it requires only the calculation and inversion of a sub-block of the Hessian matrix of second derivatives of the function through which the constraints are defined. We also present a practical application to the case in which the sought observables are the Euclidean coordinates of complex molecular systems, and the function whose minimization defines the constraints is the potential energy. Finally, to validate the method, which, as far as we are aware, is the first of its kind in the literature, we compare it to the natural and straightforward finite-differences approach in three molecules of biological relevance: methanol, N-methyl-acetamide and a tri-glycine peptide. Comment: 13 pages, 8 figures, published versio

arXiv.org e-Print Archive

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

Digital.CSIC

Relative fixed-width stopping rules for Markov chain Monte Carlo simulations

Author: Flegal James M.
Gong Lei
Publication date: 01/03/2013

Markov chain Monte Carlo (MCMC) simulations are commonly employed for estimating features of a target distribution, particularly for Bayesian inference. A fundamental challenge is determining when these simulations should stop. We consider a sequential stopping rule that terminates the simulation when the width of a confidence interval is sufficiently small relative to the size of the target parameter. Specifically, we propose relative magnitude and relative standard deviation stopping rules in the context of MCMC. In each setting, we develop sufficient conditions for asymptotic validity, that is, conditions to ensure the simulation will terminate with probability one and the resulting confidence intervals will have the proper coverage probability. Our results are applicable in a wide variety of MCMC estimation settings, such as expectation, quantile, or simultaneous multivariate estimation. Finally, we investigate the finite sample properties through a variety of examples and provide some recommendations to practitioners. Comment: 24 page
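A relative magnitude stopping rule of the kind described in this abstract might look roughly as follows. This is a minimal sketch under stated assumptions, not the authors' procedure: the standard error comes from non-overlapping batch means with batch size about the square root of the run length, a fixed minimum run length `min_n` stands in for whatever safeguard against premature stopping the paper uses, and the names `batch_means_se`, `run_until_relative_width`, and `make_ar1` (a toy AR(1) chain playing the role of an MCMC sampler) are all hypothetical.

```python
import math
import random

def batch_means_se(xs):
    """Standard error of the sample mean via non-overlapping batch means."""
    n = len(xs)
    b = int(math.sqrt(n))              # batch size (assumed default choice)
    a = n // b                         # number of complete batches
    means = [sum(xs[k * b:(k + 1) * b]) / b for k in range(a)]
    grand = sum(means) / a
    var_hat = b * sum((m - grand) ** 2 for m in means) / (a - 1)
    return math.sqrt(var_hat / (a * b))

def run_until_relative_width(draw, eps=0.05, z=1.96, min_n=1000,
                             check_every=500, max_n=1_000_000):
    """Draw until the CI half-width is at most eps * |current estimate|."""
    xs = []
    while len(xs) < max_n:
        xs.append(draw())
        n = len(xs)
        if n >= min_n and n % check_every == 0:
            est = sum(xs) / n
            half_width = z * batch_means_se(xs)
            if half_width <= eps * abs(est):
                return est, half_width, n
    # Budget exhausted without meeting the criterion.
    return sum(xs) / len(xs), z * batch_means_se(xs), len(xs)

def make_ar1(mu=2.0, rho=0.5, seed=1):
    """Toy correlated chain with stationary mean mu (stand-in for MCMC)."""
    rng = random.Random(seed)
    state = [mu]
    def draw():
        state[0] = mu + rho * (state[0] - mu) + rng.gauss(0.0, 1.0)
        return state[0]
    return draw

est, half_width, n = run_until_relative_width(make_ar1(), eps=0.05)
```

Because the criterion scales with the estimate, the same `eps` gives comparable relative precision whether the target is near 2 or near 2000; the rule is not suitable when the target is close to zero.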

arXiv.org e-Print Archive

CiteSeerX