5 research outputs found

    Optimal Noise-Adding Mechanism in Additive Differential Privacy

    We derive the optimal (0, δ)-differentially private query-output independent noise-adding mechanism for a single real-valued query function under a general cost-minimization framework. Under a mild technical condition, we show that the optimal noise probability distribution is a uniform distribution with a probability mass at the origin. We explicitly derive the optimal noise distribution for general ℓ^p cost functions, including the ℓ^1 (noise magnitude) and ℓ^2 (noise power) cost functions, and show that the probability concentration on the origin occurs when δ > p/(p+1). Our result demonstrates an improvement over the existing Gaussian mechanisms by factors of two and three, respectively, in the context of minimizing the noise magnitude and the noise power for (0, δ)-differential privacy in the high privacy regime, and the gain is more pronounced in the low privacy regime. Our result is consistent with the existing result for (0, δ)-differential privacy in the discrete setting, and identifies a probability concentration phenomenon in the continuous setting.
    Comment: 10 pages, 5 figures. Accepted by the 22nd International Conference on Artificial Intelligence and Statistics (AISTATS 2019).
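    As a rough illustration of the mechanism's shape described in the abstract, here is a minimal Python sketch of sampling from a uniform distribution with a probability mass at the origin. The parameters `mass_at_zero` and `half_width` are hypothetical stand-ins for the optimal values the paper derives from δ and the ℓ^p cost function, which the abstract does not spell out.

```python
import random

def optimal_additive_noise(mass_at_zero: float, half_width: float) -> float:
    """Sample noise that is 0 with probability `mass_at_zero` and
    uniform on [-half_width, half_width] otherwise.

    Both parameters are placeholders: the paper derives their optimal
    values from delta and the cost function, not reproduced here.
    """
    if random.random() < mass_at_zero:
        # Probability concentration at the origin (occurs when delta > p/(p+1)).
        return 0.0
    return random.uniform(-half_width, half_width)

# Hypothetical usage: privatize a single real-valued query answer.
true_answer = 42.0
private_answer = true_answer + optimal_additive_noise(mass_at_zero=0.8, half_width=1.0)
```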

    Privacy and Utility Tradeoff in Approximate Differential Privacy

    We characterize the minimum noise amplitude and power for noise-adding mechanisms in (ε, δ)-differential privacy for a single real-valued query function. We derive new lower bounds using the duality of linear programming, and new upper bounds by proposing a new class of (ε, δ)-differentially private mechanisms, the truncated Laplacian mechanisms. We show that the multiplicative gap between the lower and upper bounds goes to zero in various high privacy regimes, proving the tightness of the bounds and thus establishing the optimality of the truncated Laplacian mechanism. In particular, our results close the previous constant multiplicative gap in the discrete setting. Numerical experiments show the improvement of the truncated Laplacian mechanism over the optimal Gaussian mechanism in all privacy regimes.
    Comment: 15 pages, 3 figures.
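    For readers unfamiliar with the mechanism, a minimal sketch of truncated Laplacian noise follows, assuming a rejection-sampling implementation. How `scale` and `bound` should be calibrated from (ε, δ) and the query sensitivity is the subject of the paper; the values used here are illustrative assumptions only.

```python
import random

def truncated_laplace_noise(scale: float, bound: float) -> float:
    """Rejection-sample from Laplace(0, scale) truncated to [-bound, bound]."""
    while True:
        # A Laplace draw is an exponential magnitude with a random sign.
        magnitude = random.expovariate(1.0 / scale)  # exponential with mean `scale`
        x = magnitude if random.random() < 0.5 else -magnitude
        if abs(x) <= bound:  # keep only samples inside the truncation window
            return x

# Hypothetical usage for a sensitivity-1 query at epsilon = 1.
noisy_answer = 42.0 + truncated_laplace_noise(scale=1.0, bound=5.0)
```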

    Privacy-preserving Channel Estimation in Cell-free Hybrid Massive MIMO Systems

    We consider a cell-free hybrid massive multiple-input multiple-output (MIMO) system with K users and M access points (APs), each with N_a antennas and N_r < N_a radio frequency (RF) chains. When K ≪ MN_a, efficient uplink channel estimation and data detection with a reduced number of pilots can be performed based on low-rank matrix completion. However, such a scheme requires the central processing unit (CPU) to collect received signals from all APs, which may enable the CPU to infer private information about user locations. We therefore develop and analyze privacy-preserving channel estimation schemes under the framework of differential privacy (DP). As the key ingredient of the channel estimator, two joint differentially private noisy matrix completion algorithms, based respectively on Frank-Wolfe iteration and singular value decomposition, are presented. We analyze the tradeoff between privacy and channel estimation error. In particular, we show that the estimation error can be mitigated while maintaining the same privacy level by increasing the payload size with fixed pilot size, and we characterize the scaling laws of both the privacy-induced and privacy-independent error components in terms of payload size. Simulation results further demonstrate the tradeoff between privacy and channel estimation performance.
    Comment: 30 pages, 10 figures.
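    A minimal sketch of the Frank-Wolfe flavor of noisy matrix completion the abstract mentions, assuming Gaussian gradient perturbation and a nuclear-norm constraint. The function name, the noise calibration `noise_std`, and the constraint `radius` are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def dp_frank_wolfe_completion(Y, mask, radius, noise_std, n_iters=50, seed=0):
    """Noisy Frank-Wolfe over the nuclear-norm ball for matrix completion.

    `noise_std` stands in for whatever Gaussian-noise calibration yields
    the desired DP guarantee; the paper's calibration is not given here.
    """
    rng = np.random.default_rng(seed)
    X = np.zeros(Y.shape)
    for t in range(n_iters):
        # Gradient of 0.5 * ||mask * (X - Y)||_F^2, perturbed for privacy.
        grad = mask * (X - Y) + rng.normal(0.0, noise_std, size=Y.shape)
        # Linear minimization oracle over {X : ||X||_* <= radius}:
        # -radius * (top left singular vector)(top right singular vector)^T.
        U, _, Vt = np.linalg.svd(grad, full_matrices=False)
        S = -radius * np.outer(U[:, 0], Vt[0, :])
        gamma = 2.0 / (t + 2.0)  # standard Frank-Wolfe step size
        X = (1.0 - gamma) * X + gamma * S
    return X
```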

    Training Production Language Models without Memorizing User Data

    This paper presents the first consumer-scale next-word prediction (NWP) model trained with Federated Learning (FL) while leveraging the Differentially Private Federated Averaging (DP-FedAvg) technique. There has been prior work on building practical FL infrastructure, including work demonstrating the feasibility of training language models on mobile devices using such infrastructure. It has also been shown (in simulations on a public corpus) that it is possible to train NWP models with user-level differential privacy using the DP-FedAvg algorithm. Nevertheless, training production-quality NWP models with DP-FedAvg in a real-world production environment on a heterogeneous fleet of mobile phones requires addressing numerous challenges. For instance, the coordinating central server has to keep track of the devices available at the start of each round and sample devices uniformly at random from them, while ensuring secrecy of the sample, among other requirements. Unlike all prior privacy-focused FL work of which we are aware, we demonstrate for the first time the deployment of a differentially private mechanism for the training of a production neural network in FL, as well as the instrumentation of the production training infrastructure to perform an end-to-end empirical measurement of unintended memorization.
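    A minimal sketch of one DP-FedAvg server round, assuming the usual recipe of clipping each client update and adding Gaussian noise to the clipped average. The clipping norm, noise multiplier, and the set of sampled clients are placeholders, not the production settings described in the paper.

```python
import numpy as np

def dp_fedavg_round(global_weights, client_updates, clip_norm, noise_mult, rng):
    """One DP-FedAvg round: clip each client's update, average, add noise."""
    clipped = []
    for delta in client_updates:
        norm = np.linalg.norm(delta)
        # Scale the update down so its l2 norm is at most `clip_norm`.
        clipped.append(delta * min(1.0, clip_norm / max(norm, 1e-12)))
    avg = np.mean(clipped, axis=0)
    # Gaussian noise scaled to the clipping norm and the number of clients.
    noise = rng.normal(0.0, noise_mult * clip_norm / len(client_updates),
                       size=avg.shape)
    return global_weights + avg + noise

# Hypothetical usage with synthetic 10-dimensional updates from 100 clients.
rng = np.random.default_rng(0)
weights = np.zeros(10)
updates = [rng.normal(size=10) for _ in range(100)]
weights = dp_fedavg_round(weights, updates, clip_norm=1.0, noise_mult=1.0, rng=rng)
```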

    Projection Efficient Subgradient Method and Optimal Nonsmooth Frank-Wolfe Method

    We consider the classical setting of optimizing a nonsmooth Lipschitz-continuous convex function over a convex constraint set, given access to a (stochastic) first-order oracle (FO) for the function and a projection oracle (PO) for the constraint set. It is well known that achieving ε-suboptimality in high dimensions requires Θ(ε^{-2}) FO calls; this is achieved by the projected subgradient method (PGD). However, PGD also entails O(ε^{-2}) PO calls, which may be computationally costlier than FO calls (e.g., for nuclear-norm constraints). Improving the PO call complexity of PGD is largely unexplored, despite the fundamental nature of this problem and an extensive literature. We present the first such improvement. It requires only the mild assumption that the objective function, when extended to a slightly larger neighborhood of the constraint set, remains Lipschitz and accessible via the FO. In particular, we introduce the MOPES method, which carefully combines Moreau-Yosida smoothing and accelerated first-order schemes, and is guaranteed to find a feasible ε-suboptimal solution using only O(ε^{-1}) PO calls and an optimal O(ε^{-2}) FO calls. Further, if instead of a PO we only have a linear minimization oracle (LMO, a la Frank-Wolfe) for the constraint set, an extension of our method, MOLES, finds a feasible ε-suboptimal solution using O(ε^{-2}) LMO calls and FO calls, both of which match known lower bounds, resolving a question left open since White (1993). Our experiments confirm that these methods achieve significant speedups over the state of the art for problems with costly PO and LMO calls.
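    For context, a minimal sketch of the PGD baseline the abstract refers to, in which every iteration spends exactly one FO call and one PO call; MOPES itself, with its Moreau-Yosida smoothing, is not reconstructed here. The oracles `fo` and `po` are user-supplied callables.

```python
import numpy as np

def projected_subgradient(fo, po, x0, lipschitz, diameter, n_iters):
    """PGD baseline: one FO call and one PO call per iteration, so reaching
    epsilon-suboptimality costs O(eps^{-2}) calls to each oracle.

    `fo(x)` returns a subgradient at x; `po(y)` projects y onto the set.
    """
    x = x0
    avg = np.zeros_like(x0)
    for t in range(1, n_iters + 1):
        g = fo(x)                                    # one FO call
        step = diameter / (lipschitz * np.sqrt(t))   # diminishing step size
        x = po(x - step * g)                         # one PO call
        avg += (x - avg) / t                         # running average of iterates
    return avg
```

    Each of the `n_iters` iterations makes exactly one call to each oracle; it is this O(ε^{-2}) PO-call cost that the abstract's MOPES method improves to O(ε^{-1}).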