117 research outputs found
Selling Privacy at Auction
We initiate the study of markets for private data, through the lens of
differential privacy. Although the purchase and sale of private data has
already begun on a large scale, a theory of privacy as a commodity is missing.
In this paper, we propose to build such a theory. Specifically, we consider a
setting in which a data analyst wishes to buy information from a population
from which he can estimate some statistic. The analyst wishes to obtain an
accurate estimate cheaply. On the other hand, the owners of the private data
experience some cost for their loss of privacy, and must be compensated for
this loss. Agents are selfish, and wish to maximize their profit, so our goal
is to design truthful mechanisms. Our main result is that such auctions can
naturally be viewed and optimally solved as variants of multi-unit procurement
auctions. Based on this result, we derive auctions for two natural settings
which are optimal up to small constant factors:
1. In the setting in which the data analyst has a fixed accuracy goal, we
show that an application of the classic Vickrey auction achieves the analyst's
accuracy goal while minimizing his total payment.
2. In the setting in which the data analyst has a fixed budget, we give a
mechanism which maximizes the accuracy of the resulting estimate while
guaranteeing that the resulting sum payments do not exceed the analyst's budget.
In both cases, our comparison class is the set of envy-free mechanisms, which
correspond to the natural class of fixed-price mechanisms in our setting.
In both of these results, we ignore the privacy cost due to possible
correlations between an individual's private data and his valuation for privacy
itself. We then show that generically, no individually rational mechanism can
compensate individuals for the privacy loss incurred due to their reported
valuations for privacy.
Comment: Extended Abstract appeared in the proceedings of EC 201
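The fixed-accuracy setting above can be illustrated with a textbook multi-unit procurement auction. This is a minimal sketch, not the paper's exact mechanism: it assumes the accuracy goal has already been translated into a number `k` of individuals whose data must be bought (that translation is not shown here), and the names `procurement_auction`, `bids`, and `k` are hypothetical. The `k` lowest bidders win and each is paid the (k+1)-th lowest bid, so no winner's payment depends on their own report.

```python
def procurement_auction(bids, k):
    """Buy one unit from each of the k lowest bidders at a uniform price.

    bids: dict mapping seller id -> reported privacy cost (assumed input).
    Returns (winners, price), where price is the (k+1)-th lowest bid.
    """
    if not 0 < k < len(bids):
        raise ValueError("need 0 < k < number of bidders")
    # Sort sellers by reported cost; break ties by id so the result is deterministic.
    ranked = sorted(bids, key=lambda seller: (bids[seller], seller))
    winners = ranked[:k]
    # Every winner is paid the first losing bid, not their own bid.
    price = bids[ranked[k]]
    return winners, price


reported_costs = {"a": 3.0, "b": 1.0, "c": 7.0, "d": 2.0, "e": 5.0}
winners, price = procurement_auction(reported_costs, k=3)
total_payment = price * len(winners)
```

In this example the three cheapest sellers are `b`, `d`, and `a`, and each receives the fourth-lowest bid. A winner who shades their report up or down without crossing that threshold changes neither the outcome nor their payment, which is the intuition behind truthfulness in this family of auctions.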
Selling privacy at auction. In:
ABSTRACT We initiate the study of markets for private data, through the lens of differential privacy. Although the purchase and sale of private data has already begun on a large scale, a theory of privacy as a commodity is missing. In this paper, we propose to build such a theory. Specifically, we consider a setting in which a data analyst wishes to buy information from a population from which he can estimate some statistic. The analyst wishes to obtain an accurate estimate cheaply, while the owners of the private data experience some cost for their loss of privacy, and must be compensated for this loss. Agents are selfish, and wish to maximize their profit, so our goal is to design truthful mechanisms. Our main result is that such problems can naturally be viewed and optimally solved as variants of multi-unit procurement auctions. Based on this result, we derive auctions which are optimal up to small constant factors for two natural settings:
1. When the data analyst has a fixed accuracy goal, we show that an application of the classic Vickrey auction achieves the analyst's accuracy goal while minimizing his total payment.
2. When the data analyst has a fixed budget, we give a mechanism which maximizes the accuracy of the resulting estimate while guaranteeing that the resulting sum payments do not exceed the analyst's budget.
In both cases, our comparison class is the set of envy-free mechanisms, which correspond to the natural class of fixed-price mechanisms in our setting. In both of these results, we ignore the privacy cost due to possible correlations between an individual's private data and his valuation for privacy itself. We then show that generically, no individually rational mechanism can compensate individuals for the privacy loss incurred due to their reported valuations for privacy. This is nevertheless an important issue, and modeling it correctly is one of the many exciting directions for future work.
Why the Economics Profession Must Actively Participate in the Privacy Protection Debate
When Google or the U.S. Census Bureau publish detailed statistics on browsing habits or neighborhood characteristics, some privacy is lost for everybody while supplying public information. To date, economists have not focused on the privacy loss inherent in data publication. Instead, these issues have been advanced almost exclusively by computer scientists who are primarily interested in technical problems associated with protecting privacy. Economists should join the discussion, first, to determine where to balance privacy protection against data quality, which is a social choice problem. Second, economists must ensure that new privacy models preserve the validity of public data for economic research.
Implementasi Algoritma Merkle-Hellman Knapsack dalam Penyandian Record Database (Implementation of the Merkle-Hellman Knapsack Algorithm in Database Record Encryption)
A great deal of data is misused without the data owner being aware of it. Software developers must ensure the security of user data on their systems. Given the size of the market that houses data, the security of database records must be of great concern. Cryptographic systems, or data encryption, can be used to secure data. The Merkle-Hellman Knapsack algorithm is a public-key cryptosystem because it uses different keys for the encryption and decryption processes. It is based on the knapsack problem, which is NP-complete and has no known polynomial-time solution. The algorithm has three stages: key generation, encryption, and decryption. The results of this study secure database records from theft by storing the records as ciphertext. The ciphertext produced by the encryption is larger than the plaintext.
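The three stages named above (key generation, encryption, decryption) can be sketched as follows. This is a toy illustration of the textbook Merkle-Hellman scheme with 8-bit blocks, not the implementation from the study: the function names, the block size, the small random increments, and the fixed `seed` are all assumptions chosen for readability.

```python
import math
import random


def generate_keys(n_bits=8, seed=0):
    """Return (public_key, private_key) for n_bits-bit blocks (toy parameters)."""
    rng = random.Random(seed)
    # Superincreasing sequence: each term exceeds the sum of all previous terms.
    w, total = [], 0
    for _ in range(n_bits):
        term = total + rng.randint(1, 10)
        w.append(term)
        total += term
    q = total + rng.randint(1, 10)      # modulus strictly larger than sum(w)
    r = rng.randrange(2, q)
    while math.gcd(r, q) != 1:          # multiplier must be invertible mod q
        r = rng.randrange(2, q)
    public_key = [(r * wi) % q for wi in w]
    return public_key, (w, q, r)


def encrypt(message, public_key):
    """Encrypt each byte as the sum of the public-key terms selected by its bits."""
    n = len(public_key)
    return [
        sum(b for i, b in enumerate(public_key) if (byte >> (n - 1 - i)) & 1)
        for byte in message
    ]


def decrypt(ciphertext, private_key):
    """Undo the modular multiplication, then solve the easy subset-sum greedily."""
    w, q, r = private_key
    r_inv = pow(r, -1, q)               # modular inverse (Python 3.8+)
    out = bytearray()
    for c in ciphertext:
        s = (c * r_inv) % q
        byte = 0
        # Largest term first: a superincreasing sequence makes greedy choice exact.
        for i in range(len(w) - 1, -1, -1):
            if w[i] <= s:
                s -= w[i]
                byte |= 1 << (len(w) - 1 - i)
        out.append(byte)
    return bytes(out)


public_key, private_key = generate_keys()
record = b"record 42"
ciphertext = encrypt(record, public_key)
recovered = decrypt(ciphertext, private_key)
```

Each plaintext byte becomes one integer that can be as large as the sum of the public-key terms, so it can need more than 8 bits to store. This is one way to see why the ciphertext ends up larger than the plaintext.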
Linear Regression from Strategic Data Sources
Linear regression is a fundamental building block of statistical data
analysis. It amounts to estimating the parameters of a linear model that maps
input features to corresponding outputs. In the classical setting where the
precision of each data point is fixed, the famous Aitken/Gauss-Markov theorem
in statistics states that generalized least squares (GLS) is a so-called "Best
Linear Unbiased Estimator" (BLUE). In modern data science, however, one often
faces strategic data sources, namely, individuals who incur a cost for
providing high-precision data.
In this paper, we study a setting in which features are public but
individuals choose the precision of the outputs they reveal to an analyst. We
assume that the analyst performs linear regression on this dataset, and
individuals benefit from the outcome of this estimation. We model this scenario
as a game where individuals minimize a cost comprising two components: (a) an
(agent-specific) disclosure cost for providing high-precision data; and (b) a
(global) estimation cost representing the inaccuracy in the linear model
estimate. In this game, the linear model estimate is a public good that
benefits all individuals. We establish that this game has a unique non-trivial
Nash equilibrium. We study the efficiency of this equilibrium and we prove
tight bounds on the price of stability for a large class of disclosure and
estimation costs. Finally, we study the estimator accuracy achieved at
equilibrium. We show that, in general, Aitken's theorem does not hold under
strategic data sources, though it does hold if individuals have identical
disclosure costs (up to a multiplicative factor). When individuals have
non-identical costs, we derive a bound on the improvement of the equilibrium
estimation cost that can be achieved by deviating from GLS, under mild
assumptions on the disclosure cost functions.
Comment: This version (v3) extends the results on the sub-optimality of GLS
(Section 6) and improves writing in multiple places compared to v2. Compared
to the initial version v1, it also fixes an error in Theorem 6 (now Theorem
5), and extended many of the result
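The classical, non-strategic baseline mentioned in this abstract can be made concrete for a one-parameter model. The sketch below assumes the simplest case, y_i = beta * x_i + noise_i with independent noise of known variance, in which GLS reduces to inverse-variance weighting. It does not model the game or the equilibrium studied in the paper, and the function names and numbers are hypothetical.

```python
def gls_slope(xs, ys, variances):
    """GLS estimate of beta in y_i = beta * x_i + noise_i, Var(noise_i) = variances[i]."""
    num = sum(x * y / v for x, y, v in zip(xs, ys, variances))
    den = sum(x * x / v for x, v in zip(xs, variances))
    return num / den


def slope_variance(xs, variances, weights):
    """Variance of the linear unbiased estimator sum(w*x*y) / sum(w*x*x)."""
    num = sum((w * x) ** 2 * v for x, v, w in zip(xs, variances, weights))
    den = sum(w * x * x for x, w in zip(xs, weights)) ** 2
    return num / den


xs = [1.0, 2.0, 3.0]
variances = [0.1, 1.0, 10.0]            # assumed, fixed precision of each data point
ys = [2.0 * x for x in xs]              # noiseless data with beta = 2, for checking

beta_hat = gls_slope(xs, ys, variances)
gls_var = slope_variance(xs, variances, [1.0 / v for v in variances])
ols_var = slope_variance(xs, variances, [1.0] * len(xs))
```

With fixed variances, the inverse-variance weights give a smaller estimator variance than equal weights, which is the content of the Aitken/Gauss-Markov theorem in this special case. The paper's question is what happens when the variances are instead chosen strategically by the data sources.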
Conducting Truthful Surveys, Cheaply
We consider the problem of conducting a survey with the goal of obtaining an
unbiased estimator of some population statistic when individuals have unknown
costs (drawn from a known prior) for participating in the survey. Individuals
must be compensated for their participation and are strategic agents, and so
the payment scheme must incentivize truthful behavior. We derive optimal
truthful mechanisms for this problem for the two goals of minimizing the
variance of the estimator given a fixed budget, and minimizing the expected
cost of the survey given a fixed variance goal.
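The abstract does not say which estimator is used. One standard way to keep a survey estimate unbiased when individuals are included with different probabilities is inverse-probability weighting (the Horvitz-Thompson estimator). The sketch below assumes independent inclusion with known probabilities and ignores payments entirely. The function names and the example numbers are hypothetical, and exact fractions are used so the unbiasedness check involves no rounding.

```python
from fractions import Fraction
from itertools import product


def horvitz_thompson_total(values, probs, included):
    """Estimate sum(values) from the included individuals, weighting each by 1/p_i."""
    return sum(v / p for v, p, inc in zip(values, probs, included) if inc)


def expected_estimate(values, probs):
    """Exact expectation of the estimator over all independent inclusion patterns."""
    expectation = Fraction(0)
    for pattern in product([False, True], repeat=len(values)):
        # Probability of this particular set of participants.
        weight = Fraction(1)
        for p, inc in zip(probs, pattern):
            weight *= p if inc else 1 - p
        expectation += weight * horvitz_thompson_total(values, probs, pattern)
    return expectation


values = [Fraction(4), Fraction(10), Fraction(1)]
probs = [Fraction(1, 2), Fraction(1, 4), Fraction(9, 10)]   # assumed inclusion probabilities
mean_of_estimator = expected_estimate(values, probs)
```

Averaged over every possible set of participants, the estimate equals the true population total, whatever the inclusion probabilities are (as long as each is positive). Lower probabilities save on compensation but increase variance, which is the trade-off a budget or variance constraint would have to resolve.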
The Empirical Implications of Privacy-Aware Choice
This paper initiates the study of the testable implications of choice data in
settings where agents have privacy preferences. We adapt the standard
conceptualization of consumer choice theory to a situation where the consumer
is aware of, and has preferences over, the information revealed by her choices.
The main message of the paper is that little can be inferred about consumers'
preferences once we introduce the possibility that the consumer has concerns
about privacy. This holds even when consumers' privacy preferences are assumed
to be monotonic and separable. This motivates the consideration of stronger
assumptions and, to that end, we introduce an additive model for privacy
preferences that does have testable implications.