On the Measurement of Privacy as an Attacker's Estimation Error
A wide variety of privacy metrics have been proposed in the literature to
evaluate the level of protection offered by privacy-enhancing technologies.
Most of these metrics are specific to concrete systems and adversarial models,
and are difficult to generalize or translate to other contexts. Furthermore, a
better understanding of the relationships between the different privacy metrics
is needed to enable a more grounded and systematic approach to measuring privacy,
as well as to assist systems designers in selecting the most appropriate metric
for a given application.
In this work we propose a theoretical framework for privacy-preserving
systems, endowed with a general definition of privacy in terms of the
estimation error incurred by an attacker who aims to disclose the private
information that the system is designed to conceal. We show that our framework
permits interpreting and comparing a number of well-known metrics under a
common perspective. The arguments behind these interpretations are based on
fundamental results related to the theories of information, probability and
Bayes decision.
Comment: This paper has 18 pages and 17 figures.
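To make the idea of privacy as an attacker's estimation error concrete, here is a minimal sketch under assumptions that are not taken from the paper: a finite secret X, a finite observation Y, a known joint distribution, and a 0-1 (Hamming) loss. Under those assumptions, standard Bayes decision theory says the attacker's best strategy is to guess the most probable secret for each observation, and the resulting error probability can serve as a privacy score. The function name `bayes_error` and the example tables are hypothetical.

```python
from typing import Sequence


def bayes_error(joint: Sequence[Sequence[float]]) -> float:
    """Minimum probability that an attacker guesses X wrongly after seeing Y.

    joint[x][y] is assumed to hold P(X = x, Y = y) and to sum to 1.
    The loss is 0-1: any wrong guess costs 1, a correct guess costs 0.
    """
    n_obs = len(joint[0])
    # For each observation y the best guess is the x that maximises
    # P(X = x, Y = y); summing those maxima gives P(correct guess).
    p_correct = sum(max(row[y] for row in joint) for y in range(n_obs))
    return 1.0 - p_correct


# Hypothetical example: two equally likely secrets, two observations.
hidden = [[0.25, 0.25], [0.25, 0.25]]  # Y independent of X
leaky = [[0.5, 0.0], [0.0, 0.5]]       # Y reveals X exactly

print(bayes_error(hidden))  # attacker can do no better than a coin flip
print(bayes_error(leaky))   # attacker is never wrong
```

A higher value means the system conceals more; other loss functions would give other metrics, which this sketch does not cover.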
On defining partition entropy by inequalities
Partition entropy is the numerical metric of uncertainty within
a partition of a finite set, while conditional entropy measures the degree of
difficulty in predicting a decision partition when a condition partition is
provided. Since two direct methods exist for defining conditional entropy
based on its partition entropy, the inequality postulates of monotonicity,
which conditional entropy satisfies, are actually additional constraints on
its entropy. Thus, in this paper partition entropy is defined as a function
of probability distribution, satisfying all the inequalities of not only partition
entropy itself but also its conditional counterpart. These inequality
postulates formalize the intuitive understandings of uncertainty contained
in partitions of finite sets. We study the relationships between these inequalities,
and reduce the redundancies among them. According to two different
definitions of conditional entropy from its partition entropy, the convenient
and unified checking conditions for any partition entropy are presented, respectively.
These properties generalize and illuminate the common nature
of all partition entropies.
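As an illustration only, the sketch below uses Shannon entropy as one example of a partition entropy and shows two common ways of deriving a conditional entropy from it: as a difference, H(C ∧ D) − H(C), and as a block-weighted average of the entropy of D restricted to each block of C. Whether these are the two definitions the paper has in mind is an assumption; the function names are hypothetical.

```python
from math import log2
from typing import Hashable, List, Set

Block = Set[Hashable]


def entropy(partition: List[Block], n: int) -> float:
    """Shannon entropy (bits) of a partition of an n-element set."""
    return -sum((len(b) / n) * log2(len(b) / n) for b in partition if b)


def meet(c: List[Block], d: List[Block]) -> List[Block]:
    """Coarsest common refinement: all non-empty pairwise intersections."""
    return [bc & bd for bc in c for bd in d if bc & bd]


def cond_entropy_difference(d: List[Block], c: List[Block], n: int) -> float:
    """H(D | C) defined as H(C ∧ D) − H(C)."""
    return entropy(meet(c, d), n) - entropy(c, n)


def cond_entropy_average(d: List[Block], c: List[Block], n: int) -> float:
    """H(D | C) defined as the average over blocks of C of the entropy
    of D restricted to that block, weighted by block size."""
    total = 0.0
    for bc in c:
        restricted = [bd & bc for bd in d if bd & bc]
        total += (len(bc) / n) * entropy(restricted, len(bc))
    return total


# Hypothetical example on the set {0, 1, 2, 3}.
condition = [{0, 1}, {2, 3}]
decision = [{0, 2}, {1, 3}]
print(cond_entropy_difference(decision, condition, 4))
print(cond_entropy_average(decision, condition, 4))
```

For Shannon entropy the two constructions agree; for other entropy functions they need not, which is one reason to treat the inequality postulates for each definition separately.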
A Utility-Theoretic Approach to Privacy in Online Services
Online offerings such as web search, news portals, and e-commerce applications face the challenge of providing high-quality service to a large, heterogeneous user base. Recent efforts have highlighted the potential to improve performance by introducing methods to personalize services based on special knowledge about users and their context. For example, a user's demographics, location, and past search and browsing may be useful in enhancing the results offered in response to web search queries. However, reasonable concerns about privacy on the part of users, providers, and government agencies acting on behalf of citizens may limit access by services to such information. We introduce and explore an economics of privacy in personalization, where people can opt to share personal information, in a standing or on-demand manner, in return for expected enhancements in the quality of an online service. We focus on the example of web search and formulate realistic objective functions for search efficacy and privacy. We demonstrate how we can efficiently find a provably near-optimal solution to the utility-privacy tradeoff. We evaluate our methodology on data drawn from a log of the search activity of volunteer participants. We separately assess users' preferences about privacy and utility via a large-scale survey, aimed at eliciting people's willingness to trade the sharing of personal data in return for gains in search efficiency. We show that a significant level of personalization can be achieved using a relatively small amount of information about users.
Understanding a Version of Multivariate Symmetric Uncertainty to assist in Feature Selection
In this paper, we analyze the behavior of the multivariate symmetric
uncertainty (MSU) measure through the use of statistical simulation techniques
under various mixes of informative and non-informative randomly generated
features. Experiments show how the number of attributes, their cardinalities,
and the sample size affect the MSU. We discovered a condition that preserves
good quality in the MSU under different combinations of these three factors,
providing a new useful criterion to help drive the process of dimension
reduction.
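The abstract does not state the MSU formula, so the sketch below assumes one formulation: MSU = n/(n−1) · [1 − H(X₁,…,Xₙ) / Σ H(Xᵢ)], with entropies estimated from sample frequencies. For n = 2 this reduces to the standard bivariate symmetric uncertainty 2·I(X;Y) / (H(X) + H(Y)). The function names and the toy data are hypothetical.

```python
from collections import Counter
from math import log2
from typing import Hashable, Sequence


def entropy(samples: Sequence[Hashable]) -> float:
    """Empirical Shannon entropy (bits) of a sequence of discrete values."""
    n = len(samples)
    return -sum((c / n) * log2(c / n) for c in Counter(samples).values())


def msu(columns: Sequence[Sequence[Hashable]]) -> float:
    """Assumed MSU formula over two or more equally long discrete columns."""
    k = len(columns)
    if k < 2:
        raise ValueError("MSU needs at least two variables")
    marginal = sum(entropy(col) for col in columns)
    if marginal == 0:
        return 0.0  # all columns constant: nothing to share
    # Joint entropy: treat each row (one value per column) as one symbol.
    joint = entropy(list(zip(*columns)))
    return (k / (k - 1)) * (1.0 - joint / marginal)


# Hypothetical toy data: y copies x, z is unrelated to x.
x = [0, 0, 1, 1]
y = [0, 0, 1, 1]
z = [0, 1, 0, 1]
print(msu([x, y]))  # fully redundant pair
print(msu([x, z]))  # independent pair
```

Because the entropies are plug-in estimates, small samples or high-cardinality attributes inflate the score, which is the kind of effect the simulation study examines.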
A survey of cost-sensitive decision tree induction algorithms
The past decade has seen significant interest in the problem of inducing decision trees that take account of the costs of misclassification and the costs of acquiring the features used for decision making. This survey identifies over 50 algorithms, including approaches that are direct adaptations of accuracy-based methods, use genetic algorithms, use anytime methods, or utilize boosting and bagging. The survey brings together these different studies and novel approaches to cost-sensitive decision tree learning, provides a useful taxonomy and a historical timeline of how the field has developed, and should provide a useful reference point for future research in this field.
Determinants of Long-term Economic Development: An Empirical Cross-country Study Involving Rough Sets Theory and Rule Induction
Empirical findings on determinants of long-term economic growth are numerous, sometimes inconsistent, highly exciting and still incomplete. The empirical analysis has been carried out almost exclusively with standard econometrics. This study compares results from cross-country regressions, as reported in the literature, with those obtained using rough sets theory and rule induction. The main advantages of using rough sets are being able to classify classes and to discretize. Thus, we do not have to deal with distributional, independence, (log-)linearity, and many other assumptions, but can keep the data as they are. The main difference between regression results and rough sets is that most education and human capital indicators can be labeled as robust attributes. In addition, we find that political indicators enter in a non-linear fashion with respect to growth.
Keywords: Economic growth, Rough sets, Rule induction