A Framework for High-Accuracy Privacy-Preserving Mining
To preserve client privacy in the data mining process, a variety of
techniques based on random perturbation of data records have been proposed
recently. In this paper, we present a generalized matrix-theoretic model of
random perturbation, which facilitates a systematic approach to the design of
perturbation mechanisms for privacy-preserving mining. Specifically, we
demonstrate that (a) the prior techniques differ only in their settings for the
model parameters, and (b) through appropriate choice of parameter settings, we
can derive new perturbation techniques that provide highly accurate mining
results even under strict privacy guarantees. We also propose a novel
perturbation mechanism wherein the model parameters are themselves
characterized as random variables, and demonstrate that this feature provides
significant improvements in privacy at a very marginal cost in accuracy.
While our model is valid for random-perturbation-based privacy-preserving
mining in general, we specifically evaluate its utility here with regard to
frequent-itemset mining on a variety of real datasets. The experimental results
indicate that our mechanisms incur substantially lower identity and support
errors than the prior techniques.
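The matrix view of random perturbation described in this abstract can be illustrated with its simplest instance, binary randomized response. The sketch below is only an illustration under stated assumptions: the function names and the retention probability `p` are hypothetical, and this is not the paper's mechanism or its parameter settings. Each client reports a perturbed bit, and the miner recovers the true distribution by inverting the 2x2 perturbation matrix.

```python
import random


def perturbation_matrix(p):
    """2x2 matrix A with A[v][u] = P(reported value v | true value u)."""
    return [[p, 1.0 - p],
            [1.0 - p, p]]


def perturb(bits, p, rng):
    """Client side: keep each bit with probability p, flip it otherwise."""
    return [b if rng.random() < p else 1 - b for b in bits]


def reconstruct(observed_dist, p):
    """Miner side: solve A x = y for the true distribution x (needs p != 0.5)."""
    a = perturbation_matrix(p)
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    y0, y1 = observed_dist
    # Apply the closed-form inverse of a 2x2 matrix to the observed distribution.
    x0 = (a[1][1] * y0 - a[0][1] * y1) / det
    x1 = (a[0][0] * y1 - a[1][0] * y0) / det
    return [x0, x1]


if __name__ == "__main__":
    rng = random.Random(0)
    true_bits = [1] * 3000 + [0] * 7000  # illustrative data, 30% ones
    noisy = perturb(true_bits, p=0.8, rng=rng)
    ones = sum(noisy) / len(noisy)
    # The estimate should be close to [0.7, 0.3], up to sampling noise.
    print(reconstruct([1.0 - ones, ones], p=0.8))
```

Values of `p` closer to 0.5 hide individual records better but make the matrix nearly singular, so the reconstruction amplifies sampling noise. This is the privacy/accuracy tension the abstract refers to.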
An Economic Analysis of Privacy Protection and Statistical Accuracy as Social Choices
Statistical agencies face a dual mandate to publish accurate statistics while protecting respondent privacy. Increasing privacy protection requires decreased accuracy. Recognizing this as a resource allocation problem, we propose an economic solution: operate where the marginal cost of increasing privacy equals the marginal benefit. Our model of production, from computer science, assumes data are published using an efficient differentially private algorithm. Optimal choice weighs the demand for accurate statistics against the demand for privacy. Examples from U.S. statistical programs show how our framework can guide decision-making. Further progress requires a better understanding of willingness-to-pay for privacy and statistical accuracy.
Securing Real-Time Internet-of-Things
Modern embedded and cyber-physical systems are ubiquitous. A large number of
critical cyber-physical systems have real-time requirements (e.g., avionics,
automobiles, power grids, manufacturing systems, and industrial control
systems). Recent developments and new functionality require real-time embedded
devices to be connected to the Internet. This gives rise to the real-time
Internet-of-things (RT-IoT) that promises a better user experience through
stronger connectivity and efficient use of next-generation embedded devices.
However, RT-IoT systems are also increasingly becoming targets for cyber-attacks,
a risk that is exacerbated by this increased connectivity. This paper gives an
introduction to RT-IoT systems, an outlook on current approaches, and possible
research challenges towards secure RT-IoT frameworks.
Quantitative information flow under generic leakage functions and adaptive adversaries
We put forward a model of action-based randomization mechanisms to analyse
quantitative information flow (QIF) under generic leakage functions, and under
possibly adaptive adversaries. This model subsumes many of the QIF models
proposed so far. Our main contributions include the following: (1) we identify
mild general conditions on the leakage function under which it is possible to
derive general and significant results on adaptive QIF; (2) we contrast the
efficiency of adaptive and non-adaptive strategies, showing that the latter are
as efficient as the former in terms of length up to an expansion factor bounded
by the number of available actions; (3) we show that the maximum information
leakage over strategies, given a finite time horizon, can be expressed in terms
of a Bellman equation. This can be used to compute an optimal finite strategy
recursively, by resorting to standard methods like backward induction.
Comment: Revised and extended version of the conference paper with the same
title that appeared in Proc. of FORTE 2014, LNC
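The Bellman-equation formulation in contribution (3) can be illustrated with a toy backward-induction computation. The following Python sketch makes several assumptions that are not taken from the paper: a uniform prior over a finite set of secrets, actions that are threshold queries with deterministic yes/no answers, and leakage measured as the adversary's probability of guessing the secret in one try. The function names are hypothetical.

```python
from functools import lru_cache


def max_vulnerability(secrets, horizon):
    """Best expected one-try guessing probability after `horizon` adaptive queries."""
    thresholds = tuple(sorted(secrets))

    @lru_cache(maxsize=None)
    def value(candidates, steps_left):
        # Terminal case: guess uniformly among the remaining candidates.
        if steps_left == 0 or len(candidates) == 1:
            return 1.0 / len(candidates)
        # Baseline: an uninformative query leaves the candidate set unchanged.
        best = value(candidates, steps_left - 1)
        for k in thresholds:
            below = frozenset(s for s in candidates if s < k)
            above = candidates - below
            if not below or not above:
                continue  # this threshold does not split the candidates
            # Bellman step: average the continuation value over both answers.
            expected = sum(
                len(part) / len(candidates) * value(part, steps_left - 1)
                for part in (below, above)
            )
            best = max(best, expected)
        return best

    return value(frozenset(secrets), horizon)


if __name__ == "__main__":
    for t in range(4):
        print(t, max_vulnerability(range(8), t))
```

In this toy setting each query can at most double the number of distinguishable classes, so with eight secrets the value doubles with each extra query until the secret is fully determined after three.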
Differentially Private Optimal Power Flow for Distribution Grids
Although distribution grid customers are obliged to share their consumption
data with distribution system operators (DSOs), a possible leakage of this data
is often disregarded in operational routines of DSOs. This paper introduces a
privacy-preserving optimal power flow (OPF) mechanism for distribution grids
that secures customer privacy from unauthorised access to OPF solutions, e.g.,
current and voltage measurements. The mechanism is based on the framework of
differential privacy, which makes it possible to control the participation risks
of individuals in a dataset by applying carefully calibrated noise to the output
of a computation. Unlike existing private mechanisms, this mechanism does not
apply the noise to the optimization parameters or to the optimization result. Instead, it
optimizes OPF variables as affine functions of the random noise, which weakens
the correlation between the grid loads and OPF variables. To ensure feasibility
of the randomized OPF solution, the mechanism makes use of chance constraints
enforced on the grid limits. The mechanism is further extended to control the
optimality loss induced by the random noise, as well as the variance of OPF
variables. The paper shows that the differentially private OPF solution does
not leak customer loads up to specified parameters.
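For context, the conventional output-perturbation approach that this abstract contrasts itself with can be sketched as the standard Laplace mechanism. The Python sketch below is not the paper's OPF mechanism: the function names, the clipping bound `max_load`, and the choice of a summed load as the released quantity are assumptions made purely for illustration.

```python
import random


def laplace_scale(sensitivity, epsilon):
    """Noise scale b = sensitivity / epsilon used by the Laplace mechanism."""
    if sensitivity < 0 or epsilon <= 0:
        raise ValueError("sensitivity must be >= 0 and epsilon must be > 0")
    return sensitivity / epsilon


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    if scale == 0:
        return 0.0
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_total_load(loads, max_load, epsilon, rng):
    """Release the sum of loads, each clipped to [0, max_load], plus Laplace noise."""
    # Clipping bounds one customer's influence on the sum by max_load,
    # which is the sensitivity used to calibrate the noise.
    clipped = [min(max(x, 0.0), max_load) for x in loads]
    return sum(clipped) + laplace_noise(laplace_scale(max_load, epsilon), rng)


if __name__ == "__main__":
    rng = random.Random(42)
    # Hypothetical customer loads; the true total is 6.0.
    print(private_total_load([1.0, 2.0, 3.0], max_load=5.0, epsilon=1.0, rng=rng))
```

Smaller values of `epsilon` give stronger privacy but a larger noise scale. Noise added this way after the fact can also push a released solution outside operating limits, which is the feasibility concern the abstract addresses with chance constraints.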
Revisiting the Economics of Privacy: Population Statistics and Confidentiality Protection as Public Goods
This paper has been replaced with http://digitalcommons.ilr.cornell.edu/ldi/37.
We consider the problem of the public release of statistical information about a population, explicitly accounting for the public-good properties of both data accuracy and privacy loss. We first consider the implications of adding the public-good component to recently published models of private data publication under differential privacy guarantees using a Vickrey-Clarke-Groves mechanism and a Lindahl mechanism. We show that data quality will be inefficiently under-supplied. Next, we develop a standard social planner’s problem using the technology set implied by (ε, δ)-differential privacy with (α, β)-accuracy for the Private Multiplicative Weights query release mechanism to study the properties of optimal provision of data accuracy and privacy loss when both are public goods. Using the production possibilities frontier implied by this technology, explicitly parameterized interdependent preferences, and the social welfare function, we display properties of the solution to the social planner’s problem. Our results directly quantify the optimal choice of data accuracy and privacy loss as functions of the technology and preference parameters. Some of these properties can be quantified using population statistics on marginal preferences and correlations between income, data accuracy preferences, and privacy loss preferences that are available from survey data. Our results show that government data custodians should publish more accurate statistics with weaker privacy guarantees than would occur with purely private data publishing. Our statistical results using the General Social Survey and the Cornell National Social Survey indicate that the welfare losses from under-providing data accuracy while over-providing privacy protection can be substantial.
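For readers unfamiliar with the (ε, δ) notation used in this abstract, the standard definition of (ε, δ)-differential privacy is restated here. The symbols M (a randomized mechanism), D and D' (datasets differing in one individual's record), and S (any set of possible outputs) are notation introduced for this note, not taken from the paper.

```latex
\Pr[M(D) \in S] \;\le\; e^{\varepsilon} \, \Pr[M(D') \in S] + \delta
\qquad \text{for all neighboring } D, D' \text{ and all output sets } S.
```

Smaller ε and δ mean a stronger privacy guarantee. This is the "privacy loss" that the abstract trades off against data accuracy.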