
    Ergodicity of Random Walks on Random DFA

    Given a DFA, we consider the random walk that starts at the initial state and at each time step moves to a new state by taking a random transition from the current state. This paper shows that for typical DFA this random walk induces an ergodic Markov chain. The notion of a typical DFA is formalized by showing that ergodicity holds with high probability when a DFA is sampled uniformly at random from the set of all automata with a fixed number of states. We also show that the same result applies to DFA obtained by minimizing typical DFA.
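    A minimal empirical companion to this claim, assuming a two-letter alphabet: sample a uniformly random DFA, form the transition matrix of the induced random walk, and run power iteration. Under ergodicity the walk's distribution converges to a unique stationary distribution. The function names and the convergence check are illustrative, not the paper's construction.

```python
# Sketch: random walk on a uniformly random DFA (illustrative only).
import random
import numpy as np

def random_dfa(n, k=2, seed=0):
    """Uniformly random DFA: each (state, letter) pair maps to a uniform state."""
    rng = random.Random(seed)
    return [[rng.randrange(n) for _ in range(k)] for _ in range(n)]

def walk_matrix(delta):
    """Markov transition matrix of the walk that picks a letter uniformly."""
    n, k = len(delta), len(delta[0])
    P = np.zeros((n, n))
    for s in range(n):
        for a in range(k):
            P[s, delta[s][a]] += 1.0 / k
    return P

P = walk_matrix(random_dfa(1000))
mu = np.zeros(1000)
mu[0] = 1.0                      # start at the initial state
for _ in range(2000):            # power iteration: mu <- mu P
    mu = mu @ P
print("states carrying stationary mass:", int(np.count_nonzero(mu > 1e-12)))
```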

    Improving the Gaussian Mechanism for Differential Privacy: Analytical Calibration and Optimal Denoising

    The Gaussian mechanism is an essential building block used in a multitude of differentially private data analysis algorithms. In this paper we revisit the Gaussian mechanism and show that the original analysis has several important limitations. Our analysis reveals that the variance formula for the original mechanism is far from tight in the high privacy regime ($\varepsilon \to 0$) and that it cannot be extended to the low privacy regime ($\varepsilon \to \infty$). We address these limitations by developing an optimal Gaussian mechanism whose variance is calibrated directly using the Gaussian cumulative distribution function instead of a tail bound approximation. We also propose to equip the Gaussian mechanism with a post-processing step based on adaptive estimation techniques, leveraging the fact that the distribution of the perturbation is known. Our experiments show that analytical calibration removes at least a third of the variance of the noise compared to the classical Gaussian mechanism, and that denoising dramatically improves the accuracy of the Gaussian mechanism in the high-dimensional regime.
    Comment: To appear at the 35th International Conference on Machine Learning (ICML), 2018.
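    A sketch of the calibration idea, using the paper's analytic characterization: Gaussian noise of scale sigma gives (eps, delta)-DP for a sensitivity-sens query iff Phi(sens/(2 sigma) - eps sigma/sens) - e^eps Phi(-sens/(2 sigma) - eps sigma/sens) <= delta, so sigma can be found by bisection instead of a tail bound. The bracket and iteration count below are illustrative choices.

```python
# Calibrate Gaussian noise via the Gaussian CDF rather than a tail bound.
from math import erf, exp, sqrt

def Phi(t):
    """Standard Gaussian cumulative distribution function."""
    return 0.5 * (1.0 + erf(t / sqrt(2.0)))

def dp_delta(sigma, eps, sens):
    """Best delta achieved by N(0, sigma^2) noise at privacy level eps."""
    a = sens / (2.0 * sigma)
    b = eps * sigma / sens
    return Phi(a - b) - exp(eps) * Phi(-a - b)

def calibrate_sigma(eps, delta, sens=1.0):
    lo, hi = 1e-6, 1e6           # assumed to bracket the solution
    for _ in range(200):         # dp_delta is decreasing in sigma
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if dp_delta(mid, eps, sens) > delta else (lo, mid)
    return hi

# Compare with the classical calibration sqrt(2 ln(1.25/delta)) / eps.
print(calibrate_sigma(eps=1.0, delta=1e-5))
```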

    Differentially Private Policy Evaluation

    We present the first differentially private algorithms for reinforcement learning, which apply to the task of evaluating a fixed policy. We establish two approaches for achieving differential privacy, provide a theoretical analysis of the privacy and utility of the two algorithms, and show promising results on simple empirical examples.
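    Illustrative only; the paper's two mechanisms are not reproduced here. The sketch shows the generic output-perturbation pattern for policy evaluation: average bounded per-user returns, then add Gaussian noise scaled to the sensitivity of that average. The bound `sens` and all parameters are assumptions.

```python
# Generic output perturbation for a Monte Carlo policy-value estimate.
import numpy as np

def private_value_estimate(returns, sens, eps, delta, rng):
    """DP estimate of a policy's value from one bounded return per user."""
    sigma = sens * np.sqrt(2.0 * np.log(1.25 / delta)) / eps
    return float(np.mean(returns) + rng.normal(0.0, sigma))

rng = np.random.default_rng(0)
returns = rng.uniform(0.0, 1.0, size=1000)   # returns assumed clipped to [0, 1]
sens = 1.0 / len(returns)                    # one user changes the mean by <= 1/n
print(private_value_estimate(returns, sens, eps=1.0, delta=1e-5, rng=rng))
```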

    Singular value automata and approximate minimization

    The present paper uses spectral theory of linear operators to construct approximately minimal realizations of weighted languages. Our new contributions are: (i) a new algorithm for the singular value decomposition (SVD) of infinite Hankel matrices based on their representation in terms of weighted automata, (ii) a new canonical form for weighted automata arising from the SVD of the corresponding Hankel matrix, and (iii) an algorithm to construct approximate minimizations of given weighted automata by truncating the canonical form. We give bounds on the quality of our approximation.
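    The computation behind contribution (i) can be sketched as follows, assuming the automaton's series converges (e.g. the spectral radius of $\sum_\sigma A_\sigma \otimes A_\sigma$ is below one): for a WFA $(\alpha, \{A_\sigma\}, \beta)$, the forward and backward Gram matrices of its Hankel matrix solve linear fixed-point equations. Names and tolerances below are illustrative.

```python
# Gram matrices of the infinite Hankel matrix of a WFA, by fixed-point iteration.
import numpy as np

def gram_matrices(alpha, As, beta, iters=100_000, tol=1e-12):
    P = np.outer(alpha, alpha)   # forward:  P = aa^T + sum_s A_s^T P A_s
    Q = np.outer(beta, beta)     # backward: Q = bb^T + sum_s A_s Q A_s^T
    for _ in range(iters):
        P_new = np.outer(alpha, alpha) + sum(A.T @ P @ A for A in As)
        Q_new = np.outer(beta, beta) + sum(A @ Q @ A.T for A in As)
        done = max(np.abs(P_new - P).max(), np.abs(Q_new - Q).max()) < tol
        P, Q = P_new, Q_new
        if done:
            break
    return P, Q
```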

    A Canonical Form for Weighted Automata and Applications to Approximate Minimization

    We study the problem of constructing approximations to a weighted automaton. Weighted finite automata (WFA) are closely related to the theory of rational series. A rational series is a function from strings to real numbers that can be computed by a finite WFA. Among others, this includes probability distributions generated by hidden Markov models and probabilistic automata. The relationship between rational series and WFA is analogous to the relationship between regular languages and ordinary automata. Associated with such rational series are infinite matrices called Hankel matrices, which play a fundamental role in the theory of minimal WFA. Our contributions are: (1) an effective procedure for computing the singular value decomposition (SVD) of such infinite Hankel matrices based on their representation in terms of finite WFA; (2) a new canonical form for finite WFA based on this SVD; and (3) an algorithm to construct approximate minimizations of a given WFA. The goal of our approximate minimization algorithm is to start from a minimal WFA and produce a smaller WFA that is close to the given one in a certain sense. The desired size of the approximating automaton is given as input. We give bounds describing how well the approximation emulates the behavior of the original WFA.
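    Continuing the Gramian sketch above, contributions (2) and (3) can be illustrated as follows: the nonzero singular values of the Hankel matrix are the square roots of the eigenvalues of PQ, and a balancing change of basis followed by truncation to the top k states yields the approximate minimization. The balancing recipe below is the standard one from balanced truncation, assumed (not quoted from the paper), and it requires P and Q to be positive definite, i.e. a minimal WFA.

```python
# Balance the Gramians of a WFA, then truncate to the top-k states.
import numpy as np
from scipy.linalg import cholesky, eigh

def svd_truncate_wfa(alpha, As, beta, P, Q, k):
    R = cholesky(Q, lower=True)                  # Q = R R^T
    evals, U = eigh(R.T @ P @ R)                 # eigenvalues of PQ
    order = np.argsort(evals)[::-1]
    s = np.sqrt(np.maximum(evals[order], 0.0))   # Hankel singular values
    U = U[:, order]
    T = R @ U @ np.diag(s ** -0.5)               # balancing: P', Q' -> diag(s)
    Ti = np.linalg.inv(T)
    alpha_b, beta_b = T.T @ alpha, Ti @ beta     # basis change preserving the series
    As_b = [Ti @ A @ T for A in As]
    # Keep the k states with the largest Hankel singular values.
    return alpha_b[:k], [A[:k, :k] for A in As_b], beta_b[:k], s
```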

    Subsampled Rényi Differential Privacy and Analytical Moments Accountant

    We study the problem of subsampling in differential privacy (DP), a question that is the centerpiece behind many successful differentially private machine learning algorithms. Specifically, we provide a tight upper bound on the Rényi Differential Privacy (RDP) (Mironov, 2017) parameters for algorithms that (1) subsample the dataset, and then (2) apply a randomized mechanism M to the subsample, in terms of the RDP parameters of M and the subsampling probability parameter. Our results generalize the moments accounting technique, developed by Abadi et al. (2016) for the Gaussian mechanism, to any subsampled RDP mechanism.
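    A minimal moments-accountant-style loop for context: per-step RDP values add up over iterations, and RDP converts to (eps, delta)-DP via eps = rdp + log(1/delta)/(alpha - 1) (Mironov, 2017). The per-step curve used here is the *unsubsampled* Gaussian mechanism, rdp(alpha) = alpha / (2 sigma^2); the paper's tight subsampled bound would replace that line. All numbers are illustrative.

```python
# Compose RDP over steps, then convert to (eps, delta)-DP.
import numpy as np

def gaussian_rdp(alpha, sigma):
    return alpha / (2.0 * sigma ** 2)

def eps_from_rdp(alphas, rdp_total, delta):
    return min(r + np.log(1.0 / delta) / (a - 1.0)
               for a, r in zip(alphas, rdp_total))

alphas = np.arange(2, 128)
steps, sigma, delta = 1000, 4.0, 1e-5
rdp_total = [steps * gaussian_rdp(a, sigma) for a in alphas]
print(eps_from_rdp(alphas, rdp_total, delta))
```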

    Diameter and Stationary Distribution of Random $r$-out Digraphs

    Let $D(n,r)$ be a random $r$-out regular directed multigraph on the set of vertices $\{1,\ldots,n\}$. In this work, we establish that for every $r \ge 2$, there exists $\eta_r > 0$ such that $\mathrm{diam}(D(n,r)) = (1+\eta_r+o(1))\log_r n$. Our techniques also allow us to bound some extremal quantities related to the stationary distribution of a simple random walk on $D(n,r)$. In particular, we determine the asymptotic behaviour of $\pi_{\max}$ and $\pi_{\min}$, the maximum and the minimum values of the stationary distribution. We show that with high probability $\pi_{\max} = n^{-1+o(1)}$ and $\pi_{\min} = n^{-(1+\eta_r)+o(1)}$. Our proof shows that the vertices with $\pi(v)$ near $\pi_{\min}$ lie at the top of "narrow, slippery towers"; such vertices are also responsible for increasing the diameter from $(1+o(1))\log_r n$ to $(1+\eta_r+o(1))\log_r n$.
    Comment: 31 pages.
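    A quick empirical companion to the theorem, with all choices illustrative: build $D(n,r)$, run power iteration for the stationary distribution of the simple random walk, and read off the exponents of $\pi_{\max}$ and $\pi_{\min}$, which the theorem predicts to be $-1+o(1)$ and $-(1+\eta_r)+o(1)$. Dropping near-zero entries is a shortcut for ignoring transient vertices.

```python
# Empirical pi_max / pi_min exponents for a random r-out digraph.
import numpy as np

rng = np.random.default_rng(0)
n, r = 10_000, 2
out = rng.integers(0, n, size=(n, r))   # r uniform out-edges per vertex
pi = np.full(n, 1.0 / n)
for _ in range(500):                    # power iteration: pi <- pi P
    nxt = np.zeros(n)
    for j in range(r):
        np.add.at(nxt, out[:, j], pi / r)
    pi = nxt
core = pi[pi > 1e-12]   # drop transient vertices, whose mass decays to 0
print("log_n pi_max:", np.log(pi.max()) / np.log(n))
print("log_n pi_min:", np.log(core.min()) / np.log(n))
```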

    Privacy Amplification by Mixing and Diffusion Mechanisms

    A fundamental result in differential privacy states that the privacy guarantees of a mechanism are preserved by any post-processing of its output. In this paper we investigate under what conditions stochastic post-processing can amplify the privacy of a mechanism. By interpreting post-processing as the application of a Markov operator, we first give a series of amplification results in terms of uniform mixing properties of the Markov process defined by said operator. Next we provide amplification bounds in terms of coupling arguments which can be applied in cases where uniform mixing is not available. Finally, we introduce a new family of mechanisms based on diffusion processes which are closed under post-processing, and analyze their privacy via a novel heat flow argument. On the applied side, we generalize the analysis of "privacy amplification by iteration" in Noisy SGD, show that it admits an exponential improvement in the strongly convex case, and study a mechanism based on the Ornstein-Uhlenbeck diffusion process, which contains the Gaussian mechanism with optimal post-processing on bounded inputs as a special case.
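    An illustrative sketch of the Ornstein-Uhlenbeck idea: run an OU diffusion with stationary law $N(0, s^2)$ for time $t$ from the query answer $f(x)$. Its transition kernel is $X_t \mid X_0 = x \sim N(x e^{-t}, s^2(1-e^{-2t}))$, so the release is a shrunk input plus Gaussian noise, and running the diffusion longer is exactly a post-processing of a shorter run. The parameters below are placeholders, not the paper's calibration.

```python
# OU-style release: shrink the bounded query answer, add Gaussian noise.
import numpy as np

def ou_mechanism(fx, t, s, rng):
    shrink = np.exp(-t)
    noise_sd = s * np.sqrt(1.0 - np.exp(-2.0 * t))
    return shrink * fx + rng.normal(0.0, noise_sd, size=np.shape(fx))

rng = np.random.default_rng(0)
fx = np.clip(np.array([0.7, -0.2, 0.4]), -1.0, 1.0)   # bounded inputs
print(ou_mechanism(fx, t=0.5, s=1.0, rng=rng))
```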

    The Privacy Blanket of the Shuffle Model

    This work studies differential privacy in the context of the recently proposed shuffle model. Unlike in the local model, where the server collecting privatized data from users can track back an input to a specific user, in the shuffle model users submit their privatized inputs to a server anonymously. This setup yields a trust model which sits in between the classical curator and local models for differential privacy. The shuffle model is the core idea in the Encode, Shuffle, Analyze (ESA) model introduced by Bittau et al. (SOSP 2017). Recent work by Cheu et al. (EUROCRYPT 2019) analyzes the differential privacy properties of the shuffle model and shows that in some cases shuffled protocols provide strictly better accuracy than local protocols. Additionally, Erlingsson et al. (SODA 2019) provide a privacy amplification bound quantifying the level of curator differential privacy achieved by the shuffle model in terms of the local differential privacy of the randomizer used by each user.

    In this context, we make three contributions. First, we provide an optimal single message protocol for summation of real numbers in the shuffle model. Our protocol is very simple and has better accuracy and communication than the protocols for this same problem proposed by Cheu et al. Optimality of this protocol follows from our second contribution, a new lower bound for the accuracy of private protocols for summation of real numbers in the shuffle model. The third contribution is a new amplification bound for analyzing the privacy of protocols in the shuffle model in terms of the privacy provided by the corresponding local randomizer. Our amplification bound generalizes the results by Erlingsson et al. to a wider range of parameters, and provides a whole family of methods to analyze privacy amplification in the shuffle model.
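    A hedged sketch of a single-message summation protocol in the style the abstract describes (not the paper's exact randomizer): each user fixed-point encodes x in [0, 1] with randomized rounding, submits a uniform value with probability gamma and the encoding otherwise, and the analyzer debiases the sum of the shuffled messages. The values of k and gamma are illustrative; a real protocol calibrates them to (eps, delta).

```python
# Single-message shuffle-model summation with a debiasing analyzer.
import numpy as np

def encode(x, k, rng):
    """Randomized rounding of x in [0, 1] to an integer in {0, ..., k}."""
    y = x * k
    low = np.floor(y)
    return int(low + (rng.random() < y - low))

def local_randomizer(x, k, gamma, rng):
    if rng.random() < gamma:
        return int(rng.integers(0, k + 1))   # pure-noise message
    return encode(x, k, rng)

def analyzer(messages, n, k, gamma):
    """Unbiased estimate of sum(x): E[msg] = (1-gamma) x k + gamma k/2."""
    return (np.sum(messages) - gamma * n * k / 2.0) / ((1.0 - gamma) * k)

rng = np.random.default_rng(0)
xs = rng.uniform(0.0, 1.0, size=1000)
k, gamma = 100, 0.05
msgs = np.array([local_randomizer(x, k, gamma, rng) for x in xs])
rng.shuffle(msgs)                            # the shuffler's only job
print(analyzer(msgs, len(xs), k, gamma), xs.sum())
```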

    Privacy-preserving Active Learning on Sensitive Data for User Intent Classification

    Active learning holds the promise of significantly reducing data annotation costs while maintaining reasonable model performance. However, it requires sending data to annotators for labeling. This presents a possible privacy leak when the training set includes sensitive user data. In this paper, we describe an approach for carrying out privacy-preserving active learning with quantifiable guarantees. We evaluate our approach by showing the tradeoff between privacy, utility and annotation budget on a binary classification task in an active learning setting.
    Comment: To appear at PAL: Privacy-Enhancing Artificial Intelligence and Language Technologies, part of the AAAI Spring Symposium Series (AAAI-SSS 2019).
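    A purely illustrative pattern, not the paper's mechanism: randomize which example is sent to annotators with an exponential-mechanism-style rule over bounded model-uncertainty scores, here implemented via the Gumbel-max trick. Both `eps_select` and the scores are placeholders.

```python
# Exponential-mechanism-style selection of the next example to annotate.
import numpy as np

def private_query_selection(scores, eps_select, rng):
    """Sample i with prob proportional to exp(eps * score_i / 2), scores in [0, 1]."""
    return int(np.argmax(scores * eps_select / 2.0 + rng.gumbel(size=len(scores))))

rng = np.random.default_rng(0)
scores = rng.uniform(0.0, 1.0, size=500)     # e.g. margin-based uncertainty
print(private_query_selection(scores, eps_select=1.0, rng=rng))
```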