
    Near-Optimal Algorithms for Differentially-Private Principal Components

    Principal components analysis (PCA) is a standard tool for identifying good low-dimensional approximations to data in high dimension. Many data sets of interest contain private or sensitive information about individuals. Algorithms which operate on such data should be sensitive to the privacy risks in publishing their outputs. Differential privacy is a framework for developing tradeoffs between privacy and the utility of these outputs. In this paper we investigate the theory and empirical performance of differentially private approximations to PCA and propose a new method which explicitly optimizes the utility of the output. We show that the sample complexity of the proposed method differs from the existing procedure in the scaling with the data dimension, and that our method is nearly optimal in terms of this scaling. We furthermore illustrate our results, showing that on real data there is a large performance gap between the existing method and our method. Comment: 37 pages, 8 figures; final version to appear in the Journal of Machine Learning Research, preliminary version was at NIPS 201

    The Poisson-Dirichlet law is the unique invariant distribution for uniform split-merge transformations

    We consider a Markov chain on the space of (countable) partitions of the interval [0,1], obtained first by size-biased sampling twice (allowing repetitions) and then merging the parts (if the sampled parts are distinct) or splitting the part uniformly (if the same part was sampled twice). We prove a conjecture of Vershik stating that the Poisson-Dirichlet law with parameter theta=1 is the unique invariant distribution for this Markov chain. Our proof uses a combination of probabilistic, combinatorial, and representation-theoretic arguments. Comment: To appear in Annals Probab. 6 figures. The only change in the new version is the addition of a proof (at end of article) that the state (1,0,0,...) is transient
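
The transition described in this abstract can be sketched in a few lines. This is a minimal illustration, not code from the paper: it assumes a finite list of part sizes summing to 1 (the paper works with countable partitions), and the name `split_merge_step` is hypothetical.

```python
import random


def split_merge_step(parts, rng):
    """One split-merge step on a finite partition of [0, 1].

    `parts` is a list of non-negative part sizes summing to 1
    (a finite stand-in for the countable partitions in the abstract).
    """
    # Size-biased sampling, twice, with repetition allowed:
    # each index is drawn with probability proportional to its part size.
    i, j = rng.choices(range(len(parts)), weights=parts, k=2)
    if i != j:
        # Two distinct parts were sampled: merge them into one part.
        rest = [p for k, p in enumerate(parts) if k not in (i, j)]
        new_parts = rest + [parts[i] + parts[j]]
    else:
        # The same part was sampled twice: split it at a uniform point.
        u = rng.random()
        rest = parts[:i] + parts[i + 1:]
        new_parts = rest + [u * parts[i], (1.0 - u) * parts[i]]
    # Keep parts in decreasing order, the usual convention for such partitions.
    return sorted(new_parts, reverse=True)
```

Each step changes the number of parts by exactly one and preserves the total mass, so iterating the function from any starting partition simulates the chain on this finite truncation.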

    Stochastic boundary conditions for molecular dynamics simulations

    In this paper we develop stochastic boundary conditions (SBC) for event-driven molecular dynamics simulations of a finite volume embedded within an infinite environment. In this method, we first collect the statistics of injection/ejection events in periodic boundary conditions (PBC). Once sufficient statistics are collected, we remove the PBC and turn on the SBC. In the SBC simulations, we allow particles leaving the system to be truly ejected from the simulation, and randomly inject particles at the boundaries by resampling from the injection/ejection statistics collected from the current or previous simulations. With the SBC, we can measure thermodynamic quantities within the grand canonical ensemble, based on the particle number and energy fluctuations. To demonstrate how useful the SBC algorithm is, we simulated a hard disk gas and measured the pair distribution function, the compressibility and the specific heat, comparing them against literature values. Comment: 24 pages, 16 figures

    Distributed $(\Delta+1)$-Coloring in Sublogarithmic Rounds

    We give a new randomized distributed algorithm for $(\Delta+1)$-coloring in the LOCAL model, running in $O(\sqrt{\log \Delta}) + 2^{O(\sqrt{\log \log n})}$ rounds in a graph of maximum degree $\Delta$. This implies that the $(\Delta+1)$-coloring problem is easier than the maximal independent set problem and the maximal matching problem, due to their lower bounds of $\Omega\left(\min\left(\sqrt{\frac{\log n}{\log \log n}}, \frac{\log \Delta}{\log \log \Delta}\right)\right)$ by Kuhn, Moscibroda, and Wattenhofer [PODC'04]. Our algorithm also extends to list-coloring where the palette of each node contains $\Delta+1$ colors. We extend the set of distributed symmetry-breaking techniques by performing a decomposition of graphs into dense and sparse parts.

    Swing Dynamics as Primal-Dual Algorithm for Optimal Load Control

    Frequency regulation and generation-load balancing are key issues in power transmission networks. Complementary to generation control, loads provide flexible and fast responsive sources for frequency regulation, and local frequency measurement capability of loads offers the opportunity of decentralized control. In this paper, we propose an optimal load control problem, which balances the load reduction (or increase) with the generation shortfall (or surplus), resynchronizes the bus frequencies, and minimizes a measure of aggregate disutility of participation in such a load control. We find that frequency-based load control, coupled with the dynamics of swing equations and branch power flows, serves as a distributed primal-dual algorithm to solve the optimal load control problem and its dual. Simulation shows that the proposed mechanism can restore frequency, balance load with generation and achieve the optimum of the load control problem within several seconds after a disturbance in generation. Through simulation, we also compare the performance of optimal load control with automatic generation control (AGC), and discuss the effect of their incorporation.

    An exact expression to calculate the derivatives of position-dependent observables in molecular simulations with flexible constraints

    In this work, we introduce an algorithm to compute the derivatives of physical observables along the constrained subspace when flexible constraints are imposed on the system (i.e., constraints in which the hard coordinates are fixed to configuration-dependent values). The presented scheme is exact, it does not contain any tunable parameter, and it only requires the calculation and inversion of a sub-block of the Hessian matrix of second derivatives of the function through which the constraints are defined. We also present a practical application to the case in which the sought observables are the Euclidean coordinates of complex molecular systems, and the function whose minimization defines the constraints is the potential energy. Finally, to validate the method, which, as far as we are aware, is the first of its kind in the literature, we compare it to the natural and straightforward finite-differences approach in three molecules of biological relevance: methanol, N-methyl-acetamide and a tri-glycine peptide. Comment: 13 pages, 8 figures, published version

    Relative fixed-width stopping rules for Markov chain Monte Carlo simulations

    Markov chain Monte Carlo (MCMC) simulations are commonly employed for estimating features of a target distribution, particularly for Bayesian inference. A fundamental challenge is determining when these simulations should stop. We consider a sequential stopping rule that terminates the simulation when the width of a confidence interval is sufficiently small relative to the size of the target parameter. Specifically, we propose relative magnitude and relative standard deviation stopping rules in the context of MCMC. In each setting, we develop sufficient conditions for asymptotic validity, that is, conditions to ensure the simulation will terminate with probability one and the resulting confidence intervals will have the proper coverage probability. Our results are applicable in a wide variety of MCMC estimation settings, such as expectation, quantile, or simultaneous multivariate estimation. Finally, we investigate the finite sample properties through a variety of examples and provide some recommendations to practitioners. Comment: 24 pages
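
To make the idea of a relative standard deviation stopping rule concrete, here is a rough sketch for estimating an expectation. It is a simplification under stated assumptions, not the paper's exact procedure: the Monte Carlo standard error is taken from non-overlapping batch means with batch size about sqrt(n), the interval uses a fixed normal quantile `z`, and the names `batch_means_se`, `run_until_relative_width` and `make_ar1` (a toy AR(1) chain standing in for a real sampler) are all hypothetical.

```python
import math
import random


def batch_means_se(xs):
    """Monte Carlo standard error of the sample mean via non-overlapping batch means."""
    n = len(xs)
    b = int(math.sqrt(n))  # batch size ~ sqrt(n): an assumed, commonly used default
    a = n // b             # number of full batches (leftover samples are ignored)
    means = [sum(xs[k * b:(k + 1) * b]) / b for k in range(a)]
    grand = sum(means) / a
    var_bm = b * sum((m - grand) ** 2 for m in means) / (a - 1)
    return math.sqrt(var_bm / (a * b))


def run_until_relative_width(draw, eps=0.05, z=1.96, n_min=1000,
                             check_every=500, n_max=200_000):
    """Draw from the chain until the CI half-width is at most eps times the
    estimated standard deviation of the target (relative sd rule, simplified)."""
    xs = []
    while len(xs) < n_max:
        xs.append(draw())
        n = len(xs)
        if n >= n_min and n % check_every == 0:
            mean = sum(xs) / n
            sd = math.sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))
            half_width = z * batch_means_se(xs)
            if half_width <= eps * sd:
                return mean, half_width, n
    raise RuntimeError("n_max reached before the stopping rule fired")


def make_ar1(rho, mu, rng):
    """Toy correlated sampler: AR(1) chain with stationary mean mu (illustration only)."""
    state = {"x": mu}

    def draw():
        state["x"] = mu + rho * (state["x"] - mu) + rng.gauss(0.0, 1.0)
        return state["x"]

    return draw
```

A call such as `run_until_relative_width(make_ar1(0.5, 2.0, random.Random(1)))` returns the estimate, the final half-width, and the number of draws used. A relative magnitude rule would compare the half-width with `eps * abs(mean)` instead of `eps * sd`.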
