Optimal Noise-Adding Mechanism in Additive Differential Privacy
We derive the optimal $(0, \delta)$-differentially private query-output
independent noise-adding mechanism for a single real-valued query function under
a general cost-minimization framework. Under a mild technical condition, we
show that the optimal noise probability distribution is a uniform distribution
with a probability mass at the origin. We explicitly derive the optimal noise
distribution for general $\ell^p$ cost functions, including $\ell^1$ (for noise
magnitude) and $\ell^2$ (for noise power) cost functions, and show that the
probability concentration on the origin occurs when $p > 1$.
Our result demonstrates an improvement over the existing Gaussian mechanisms by
a factor of two and three for $(0, \delta)$-differential privacy in the high
privacy regime, in the context of minimizing the noise magnitude and the noise
power respectively, and the gain is more pronounced in the low privacy regime.
Our result is consistent with the existing result for $(0, \delta)$-differential
privacy in the discrete setting, and identifies a probability concentration
phenomenon in the continuous setting.
Comment: 10 pages, 5 figures. Accepted by the 22nd International Conference on
Artificial Intelligence and Statistics (AISTATS 2019).
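The shape of the optimal noise distribution described above can be sketched as a sampler, a point mass at the origin plus a uniform component. This is a minimal illustration only: the optimal values of the width `a` and atom probability `q` depend on $\delta$, the query sensitivity, and the cost function as derived in the paper, and are left as free parameters here.

```python
import random

def sample_optimal_noise(a, q, rng=random):
    """Sample from the abstract's optimal noise shape for (0, delta)-DP:
    a point mass of probability q at the origin, plus a uniform
    distribution on [-a, a] carrying the remaining mass.
    The optimal (a, q) depend on delta, the sensitivity, and the cost
    function (derived in the paper); here they are illustrative inputs."""
    if rng.random() < q:
        return 0.0
    return rng.uniform(-a, a)
```

A query answer would then be released as `f(D) + sample_optimal_noise(a, q)`.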
Privacy and Utility Tradeoff in Approximate Differential Privacy
We characterize the minimum noise amplitude and power for noise-adding
mechanisms in $(\epsilon, \delta)$-differential privacy for a single real-valued
query function. We derive new lower bounds using the duality of linear
programming, and new upper bounds by proposing a new class of
$(\epsilon, \delta)$-differentially private mechanisms, the \emph{truncated
Laplacian} mechanisms. We show that the multiplicative gap between the lower
and upper bounds goes to zero in various high privacy regimes, proving the
tightness of the bounds and thus establishing the optimality of the truncated
Laplacian mechanism. In particular, our results close the previous constant
multiplicative gap in the discrete setting. Numerical experiments show the
improvement of the truncated Laplacian mechanism over the optimal Gaussian
mechanism in all privacy regimes.
Comment: 15 pages, 3 figures.
Privacy-preserving Channel Estimation in Cell-free Hybrid Massive MIMO Systems
We consider a cell-free hybrid massive multiple-input multiple-output (MIMO)
system in which multiple users are served by distributed access points (APs),
each equipped with more antennas than radio frequency (RF) chains. In this
hybrid regime, efficient uplink channel estimation and data detection with a
reduced number of pilots can be performed based on low-rank matrix completion.
However, such a scheme requires the central processing unit (CPU) to collect
received signals from all APs, which may enable the CPU to infer the private
information of user locations. We
therefore develop and analyze privacy-preserving channel estimation schemes
under the framework of differential privacy (DP). As the key ingredient of the
channel estimator, two joint differentially private noisy matrix completion
algorithms based respectively on Frank-Wolfe iteration and singular value
decomposition are presented. We provide an analysis of the tradeoff between
privacy and the channel estimation error. In particular, we show that the
estimation error can be mitigated while maintaining the same privacy level by
increasing the payload size with a fixed pilot size, and we characterize the
scaling laws of both the privacy-induced and privacy-independent error
components in terms of the payload size. Simulation results are provided to
further demonstrate the tradeoff between privacy and channel estimation
performance.
Comment: 30 pages, 10 figures.
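The rough shape of a Frank-Wolfe-based noisy matrix completion step can be sketched as below. This is a generic illustration under stated assumptions, not the paper's algorithm: it minimizes a squared error over observed entries subject to a nuclear-norm bound, adding Gaussian noise to the gradient before each linear-minimization step; the paper's joint-DP calibration of the noise level and its SVD-based alternative are not reproduced, and `sigma` is a free parameter.

```python
import numpy as np

def noisy_fw_completion(M_obs, mask, tau, sigma, iters=200, seed=0):
    """Sketch of noisy Frank-Wolfe low-rank matrix completion: minimize
    0.5 * ||mask * (X - M_obs)||_F^2 over the nuclear-norm ball
    ||X||_* <= tau, perturbing the gradient with Gaussian noise before
    each linear-minimization step. Illustrative only; the paper's
    joint-DP noise calibration is not reproduced here."""
    rng = np.random.default_rng(seed)
    X = np.zeros_like(M_obs, dtype=float)
    for t in range(iters):
        G = mask * (X - M_obs)                 # gradient on observed entries
        G_noisy = G + sigma * rng.standard_normal(G.shape)
        U, _, Vt = np.linalg.svd(-G_noisy)     # LMO over the nuclear-norm ball:
        S = tau * np.outer(U[:, 0], Vt[0, :])  # tau times the top singular pair
        gamma = 2.0 / (t + 2.0)                # standard FW step size
        X = (1.0 - gamma) * X + gamma * S
    return X
```

A design point worth noting: the linear minimization oracle here needs only the top singular pair of the (noisy) gradient, which is far cheaper than the full projection onto the nuclear-norm ball.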
Training Production Language Models without Memorizing User Data
This paper presents the first consumer-scale next-word prediction (NWP) model
trained with Federated Learning (FL) while leveraging the Differentially
Private Federated Averaging (DP-FedAvg) technique. There has been prior work on
building practical FL infrastructure, including work demonstrating the
feasibility of training language models on mobile devices using such
infrastructure. It has also been shown (in simulations on a public corpus) that
it is possible to train NWP models with user-level differential privacy using
the DP-FedAvg algorithm. Nevertheless, training production-quality NWP models
with DP-FedAvg in a real-world production environment on a heterogeneous fleet
of mobile phones requires addressing numerous challenges. For instance, the
coordinating central server has to keep track of the devices available at the
start of each round and sample devices uniformly at random from them, while
ensuring \emph{secrecy of the sample}, etc. Unlike all prior privacy-focused FL
work of which we are aware, for the first time we demonstrate the deployment of
a differentially private mechanism for the training of a production neural
network in FL, as well as the instrumentation of the production training
infrastructure to perform an end-to-end empirical measurement of unintended
memorization.
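The aggregation step of DP-FedAvg described above can be sketched over plain Python lists: sample clients, clip each sampled update to a fixed L2 norm, average, and add Gaussian noise. This is a simplified sketch, not the production algorithm; the paper's production concerns (secrecy of the sample, tracking available devices, fixed-size sampling, server-side accounting) are deliberately omitted, and Bernoulli sampling stands in for the actual sampling scheme.

```python
import math
import random

def dp_fedavg_round(global_w, client_updates, clip_norm, noise_mult,
                    sample_frac, rng=random):
    """One simplified DP-FedAvg aggregation round: Bernoulli-sample
    clients, clip each sampled update to L2 norm clip_norm, average,
    and add Gaussian noise with std noise_mult * clip_norm / n.
    Production details from the paper are intentionally left out."""
    sampled = [u for u in client_updates if rng.random() < sample_frac]
    if not sampled:
        return list(global_w)
    n, d = len(sampled), len(global_w)
    agg = [0.0] * d
    for u in sampled:
        norm = math.sqrt(sum(x * x for x in u))
        # clip: scale the update down if its norm exceeds clip_norm
        scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
        for i in range(d):
            agg[i] += scale * u[i]
    sigma = noise_mult * clip_norm / n
    return [global_w[i] + agg[i] / n + rng.gauss(0.0, sigma)
            for i in range(d)]
```

Clipping bounds each client's contribution (the sensitivity), which is what lets the added Gaussian noise provide a user-level guarantee.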
Projection Efficient Subgradient Method and Optimal Nonsmooth Frank-Wolfe Method
We consider the classical setting of optimizing a nonsmooth Lipschitz
continuous convex function over a convex constraint set, when having access to
a (stochastic) first-order oracle (FO) for the function and a projection oracle
(PO) for the constraint set. It is well known that to achieve
$\epsilon$-suboptimality in high dimensions, $\Theta(\epsilon^{-2})$ FO calls
are necessary. This is achieved by the projected subgradient method (PGD).
However, PGD also entails $O(\epsilon^{-2})$ PO calls, which may be
computationally costlier than FO calls (e.g., for nuclear norm constraints).
Improving this PO call complexity of PGD has remained largely unexplored,
despite the fundamental nature of the problem and an extensive literature. We
present the first such improvement. This only requires a mild assumption that the objective
function, when extended to a slightly larger neighborhood of the constraint
set, still remains Lipschitz and accessible via FO. In particular, we introduce
MOPES method, which carefully combines Moreau-Yosida smoothing and accelerated
first-order schemes. It is guaranteed to find a feasible
$\epsilon$-suboptimal solution using only $O(\epsilon^{-1})$ PO calls and an
optimal $O(\epsilon^{-2})$ FO calls. Further, if instead of a PO we only have a
linear minimization oracle (LMO, à la Frank-Wolfe) for the constraint
set, an extension of our method, MOLES, finds a feasible $\epsilon$-suboptimal
solution using $O(\epsilon^{-2})$ LMO calls and $O(\epsilon^{-2})$ FO
calls---both matching known lower bounds and resolving a question left open
since White (1993). Our
experiments confirm that these methods achieve significant speedups over the
state-of-the-art, for a problem with costly PO and LMO calls.
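The Moreau-Yosida smoothing ingredient can be made concrete on a toy objective. For $f(t) = |t|$ the proximal operator is closed-form (soft-thresholding), so the gradient of the Moreau envelope $F_\lambda(x) = \min_y |y| + (x-y)^2/(2\lambda)$ is computable exactly; in MOPES the prox is instead approximated with inner FO calls, which is where the method saves on projections. A minimal sketch, with `lam` a free smoothing parameter:

```python
def prox_abs(x, lam):
    """Proximal operator of f(t) = |t|, i.e. soft-thresholding. For this
    toy f the prox is closed-form; MOPES approximates the prox of a
    general nonsmooth f with inner first-order steps."""
    return max(abs(x) - lam, 0.0) * (1.0 if x >= 0 else -1.0)

def moreau_grad(x, lam):
    """Gradient of the Moreau envelope F_lam(x), which equals
    (x - prox_{lam f}(x)) / lam. It is (1/lam)-Lipschitz, so accelerated
    smooth methods apply even though |t| itself is nonsmooth."""
    return (x - prox_abs(x, lam)) / lam
```

Far from the kink the envelope gradient matches the subgradient sign of $|t|$, and near zero it is the linear ramp $x/\lambda$, which is exactly the smoothing that lets accelerated schemes run between occasional projections.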