5,229 research outputs found
XYZ Privacy
Future autonomous vehicles will generate, collect, aggregate and consume
significant volumes of data as key gateway devices in emerging Internet of
Things scenarios. While vehicles are widely accepted as one of the most
challenging mobility contexts in which to achieve effective data
communications, less attention has been paid to the privacy of data emerging
from these vehicles. The quality and usability of such privatized data will lie
at the heart of future safe and efficient transportation solutions.
In this paper, we present the XYZ Privacy mechanism. XYZ Privacy is to our
knowledge the first such mechanism that enables data creators to submit
multiple contradictory responses to a query, whilst preserving utility measured
as the absolute error from the actual original data. The functionalities are
achieved in both a scalable and secure fashion. For instance, individual
location data can be obfuscated while preserving utility, thereby enabling the
scheme to transparently integrate with existing systems (e.g. Waze). A new
cryptographic primitive Function Secret Sharing is used to achieve
non-attributable writes and we show an order of magnitude improvement from the
default implementation.Comment: arXiv admin note: text overlap with arXiv:1708.0188
An Economic Analysis of Privacy Protection and Statistical Accuracy as Social Choices
Statistical agencies face a dual mandate to publish accurate statistics while protecting respondent privacy. Increasing privacy protection requires decreased accuracy. Recognizing this as a resource allocation problem, we propose an economic solution: operate where the marginal cost of increasing privacy equals the marginal benefit. Our model of production, from computer science, assumes data are published using an efficient differentially private algorithm. Optimal choice weighs the demand for accurate statistics against the demand for privacy. Examples from U.S. statistical programs show how our framework can guide decision-making. Further progress requires a better understanding of willingness-to-pay for privacy and statistical accuracy
Location Privacy in Spatial Crowdsourcing
Spatial crowdsourcing (SC) is a new platform that engages individuals in
collecting and analyzing environmental, social and other spatiotemporal
information. With SC, requesters outsource their spatiotemporal tasks to a set
of workers, who will perform the tasks by physically traveling to the tasks'
locations. This chapter identifies privacy threats toward both workers and
requesters during the two main phases of spatial crowdsourcing, tasking and
reporting. Tasking is the process of identifying which tasks should be assigned
to which workers. This process is handled by a spatial crowdsourcing server
(SC-server). The latter phase is reporting, in which workers travel to the
tasks' locations, complete the tasks and upload their reports to the SC-server.
The challenge is to enable effective and efficient tasking as well as reporting
in SC without disclosing the actual locations of workers (at least until they
agree to perform a task) and the tasks themselves (at least to workers who are
not assigned to those tasks). This chapter aims to provide an overview of the
state-of-the-art in protecting users' location privacy in spatial
crowdsourcing. We provide a comparative study of a diverse set of solutions in
terms of task publishing modes (push vs. pull), problem focuses (tasking and
reporting), threats (server, requester and worker), and underlying technical
approaches (from pseudonymity, cloaking, and perturbation to exchange-based and
encryption-based techniques). The strengths and drawbacks of the techniques are
highlighted, leading to a discussion of open problems and future work
Recommended from our members
Privacy-preserving model learning on a blockchain network-of-networks.
ObjectiveTo facilitate clinical/genomic/biomedical research, constructing generalizable predictive models using cross-institutional methods while protecting privacy is imperative. However, state-of-the-art methods assume a "flattened" topology, while real-world research networks may consist of "network-of-networks" which can imply practical issues including training on small data for rare diseases/conditions, prioritizing locally trained models, and maintaining models for each level of the hierarchy. In this study, we focus on developing a hierarchical approach to inherit the benefits of the privacy-preserving methods, retain the advantages of adopting blockchain, and address practical concerns on a research network-of-networks.Materials and methodsWe propose a framework to combine level-wise model learning, blockchain-based model dissemination, and a novel hierarchical consensus algorithm for model ensemble. We developed an example implementation HierarchicalChain (hierarchical privacy-preserving modeling on blockchain), evaluated it on 3 healthcare/genomic datasets, as well as compared its predictive correctness, learning iteration, and execution time with a state-of-the-art method designed for flattened network topology.ResultsHierarchicalChain improves the predictive correctness for small training datasets and provides comparable correctness results with the competing method with higher learning iteration and similar per-iteration execution time, inherits the benefits of the privacy-preserving learning and advantages of blockchain technology, and immutable records models for each level.DiscussionHierarchicalChain is independent of the core privacy-preserving learning method, as well as of the underlying blockchain platform. Further studies are warranted for various types of network topology, complex data, and privacy concerns.ConclusionWe demonstrated the potential of utilizing the information from the hierarchical network-of-networks topology to improve prediction
Privacy and Accountability in Black-Box Medicine
Black-box medicine—the use of big data and sophisticated machine learning techniques for health-care applications—could be the future of personalized medicine. Black-box medicine promises to make it easier to diagnose rare diseases and conditions, identify the most promising treatments, and allocate scarce resources among different patients. But to succeed, it must overcome two separate, but related, problems: patient privacy and algorithmic accountability. Privacy is a problem because researchers need access to huge amounts of patient health information to generate useful medical predictions. And accountability is a problem because black-box algorithms must be verified by outsiders to ensure they are accurate and unbiased, but this means giving outsiders access to this health information.
This article examines the tension between the twin goals of privacy and accountability and develops a framework for balancing that tension. It proposes three pillars for an effective system of privacy-preserving accountability: substantive limitations on the collection, use, and disclosure of patient information; independent gatekeepers regulating information sharing between those developing and verifying black-box algorithms; and information-security requirements to prevent unintentional disclosures of patient information. The article examines and draws on a similar debate in the field of clinical trials, where disclosing information from past trials can lead to new treatments but also threatens patient privacy
Revisiting the Economics of Privacy: Population Statistics and Confidentiality Protection as Public Goods
This paper has been replaced with http://digitalcommons.ilr.cornell.edu/ldi/37.
We consider the problem of the public release of statistical information about a population–explicitly accounting for the public-good properties of both data accuracy and privacy loss. We first consider the implications of adding the public-good component to recently published models of private data publication under differential privacy guarantees using a Vickery-Clark-Groves mechanism and a Lindahl mechanism. We show that data quality will be inefficiently under-supplied. Next, we develop a standard social planner’s problem using the technology set implied by (ε, δ)-differential privacy with (α, β)-accuracy for the Private Multiplicative Weights query release mechanism to study the properties of optimal provision of data accuracy and privacy loss when both are public goods. Using the production possibilities frontier implied by this technology, explicitly parameterized interdependent preferences, and the social welfare function, we display properties of the solution to the social planner’s problem. Our results directly quantify the optimal choice of data accuracy and privacy loss as functions of the technology and preference parameters. Some of these properties can be quantified using population statistics on marginal preferences and correlations between income, data accuracy preferences, and privacy loss preferences that are available from survey data. Our results show that government data custodians should publish more accurate statistics with weaker privacy guarantees than would occur with purely private data publishing. Our statistical results using the General Social Survey and the Cornell National Social Survey indicate that the welfare losses from under-providing data accuracy while over-providing privacy protection can be substantial
- …