3,503 research outputs found
A Bayesian approach for on-line max and min auditing
In this paper we consider the on-line max and min query auditing problem: given a private association between fields in a data set, a sequence of max and min queries that have already been posed about the data, their corresponding answers, and a new query, deny the answer if private information would be inferred, or give the true answer otherwise. We give a probabilistic definition of privacy and demonstrate that max and min queries, without the "no duplicates" assumption, can be audited by means of a Bayesian network. Moreover, we show how our auditing approach is able to manage user prior knowledge.
Learning Fair Naive Bayes Classifiers by Discovering and Eliminating Discrimination Patterns
As machine learning is increasingly used to make real-world decisions, recent
research efforts aim to define and ensure fairness in algorithmic decision
making. Existing methods often assume a fixed set of observable features to
define individuals, but lack a discussion of certain features not being
observed at test time. In this paper, we study fairness of naive Bayes
classifiers, which allow partial observations. In particular, we introduce the
notion of a discrimination pattern, which refers to an individual receiving
different classifications depending on whether some sensitive attributes were
observed. Then a model is considered fair if it has no such pattern. We propose
an algorithm to discover and mine for discrimination patterns in a naive Bayes
classifier, and show how to learn maximum likelihood parameters subject to
these fairness constraints. Our approach iteratively discovers and eliminates
discrimination patterns until a fair model is learned. An empirical evaluation
on three real-world datasets demonstrates that we can remove exponentially many
discrimination patterns by only adding a small fraction of them as constraints.
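
As a rough illustration of the idea in this abstract, the sketch below compares a naive Bayes posterior computed with and without the sensitive attributes observed. The function names, the dictionary layout, and the use of a posterior difference as the "degree" of discrimination are assumptions made for this example; the abstract itself does not give the formal definition or the authors' algorithm.

```python
def posterior(prior, likelihoods, evidence):
    """P(D=1 | evidence) for a binary-class naive Bayes model.

    prior:       P(D=1)
    likelihoods: {feature: {value: (P(value | D=1), P(value | D=0))}}
    evidence:    {feature: value}; unobserved features are simply omitted,
                 which marginalizes them out in a naive Bayes model.
    """
    p1, p0 = prior, 1.0 - prior
    for feature, value in evidence.items():
        l1, l0 = likelihoods[feature][value]
        p1 *= l1
        p0 *= l0
    return p1 / (p1 + p0)


def discrimination_degree(prior, likelihoods, sensitive, nonsensitive):
    """Hypothetical measure: change in the posterior when the sensitive
    attributes are observed in addition to the non-sensitive ones."""
    with_sensitive = posterior(prior, likelihoods, {**sensitive, **nonsensitive})
    without_sensitive = posterior(prior, likelihoods, nonsensitive)
    return with_sensitive - without_sensitive


# Toy parameters (invented for illustration only).
toy_likelihoods = {
    "s": {1: (0.8, 0.2), 0: (0.2, 0.8)},  # sensitive attribute
    "y": {1: (0.6, 0.4), 0: (0.4, 0.6)},  # non-sensitive attribute
}
delta = discrimination_degree(0.5, toy_likelihoods, {"s": 1}, {"y": 1})
```

Under this reading, a pattern would be flagged when the absolute value of `delta` exceeds some chosen threshold; the threshold itself is also an assumption of the sketch.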
XRay: Enhancing the Web's Transparency with Differential Correlation
Today's Web services - such as Google, Amazon, and Facebook - leverage user
data for varied purposes, including personalizing recommendations, targeting
advertisements, and adjusting prices. At present, users have little insight
into how their data is being used. Hence, they cannot make informed choices
about the services they choose. To increase transparency, we developed XRay,
the first fine-grained, robust, and scalable personal data tracking system for
the Web. XRay predicts which data in an arbitrary Web account (such as emails,
searches, or viewed products) is being used to target which outputs (such as
ads, recommended products, or prices). XRay's core functions are service
agnostic and easy to instantiate for new services, and they can track data
within and across services. To make predictions independent of the audited
service, XRay relies on the following insight: by comparing outputs from
different accounts with similar, but not identical, subsets of data, one can
pinpoint targeting through correlation. We show both theoretically, and through
experiments on Gmail, Amazon, and YouTube, that XRay achieves high precision
and recall by correlating data from a surprisingly small number of extra
accounts. Comment: Extended version of a paper presented at the 23rd USENIX Security
Symposium (USENIX Security 14)
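
The insight described in this abstract, comparing outputs across accounts that hold overlapping but different subsets of data, can be sketched very loosely as follows. This is not XRay's actual algorithm: the scoring rule (a simple difference of observation rates), the function name, and the data layout are all assumptions made for illustration.

```python
def targeting_scores(accounts, output):
    """Score each input by how strongly its presence correlates with `output`.

    accounts: list of (inputs, outputs) pairs, each a set of labels.
    Returns {input: rate of `output` among accounts holding the input
                    minus the rate among accounts lacking it}.
    """
    all_inputs = set().union(*(inputs for inputs, _ in accounts))
    scores = {}
    for item in all_inputs:
        seen_with = [output in outs for ins, outs in accounts if item in ins]
        seen_without = [output in outs for ins, outs in accounts if item not in ins]
        if not seen_with or not seen_without:
            continue  # no contrast available for this input
        scores[item] = (sum(seen_with) / len(seen_with)
                        - sum(seen_without) / len(seen_without))
    return scores


# Toy data (invented): three accounts with overlapping emails.
toy_accounts = [
    ({"e1", "e2"}, {"adA"}),
    ({"e1", "e3"}, {"adA"}),
    ({"e2", "e3"}, set()),
]
scores = targeting_scores(toy_accounts, "adA")
likely_target = max(scores, key=scores.get)
```

In this toy setup the ad appears exactly in the accounts containing `e1`, so `e1` receives the highest score.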
Costly risk verification without commitment in competitive
This paper analyzes the equilibrium of an insurance market in which individuals who take out an insurance policy have a duty of good faith when they disclose private information about their risk. Insurers can, at some cost, verify the type of policyholders who file a claim, and they are allowed to retroactively void the insurance contract if it is established that the policyholder misrepresented their risk when taking out the policy. However, insurers cannot commit to their risk verification strategy. The paper analyzes the relationship between second-best Pareto optimality and the competitive equilibrium of the insurance market in a game-theoretic framework. It characterizes the contracts offered in equilibrium, individuals' contract choices, and the conditions for the existence of equilibrium.
A Bayesian partial identification approach to inferring the prevalence of accounting misconduct
This paper describes the use of flexible Bayesian regression models for
estimating a partially identified probability function. Our approach permits
efficient sensitivity analysis concerning the posterior impact of priors on the
partially identified component of the regression model. The new methodology is
illustrated on an important problem where only partially observed data is
available - inferring the prevalence of accounting misconduct among publicly
traded U.S. businesses.
Sensitivity analysis in multilinear probabilistic models
Sensitivity methods for the analysis of the outputs of discrete Bayesian networks have been extensively studied and implemented in different software packages. These methods usually focus on the study of sensitivity functions and on the impact of a parameter change to the Chan–Darwiche distance. Although not fully recognized, the majority of these results rely heavily on the multilinear structure of atomic probabilities in terms of the conditional probability parameters associated with this type of network. By defining a statistical model through the polynomial expression of its associated defining conditional probabilities, we develop here a unifying approach to sensitivity methods applicable to a large suite of models including extensions of Bayesian networks, for instance context-specific ones. Our algebraic approach enables us to prove that for models whose defining polynomial is multilinear both the Chan–Darwiche distance and any divergence in the family of φ-divergences are minimized for a certain class of multi-parameter contemporaneous variations when parameters are proportionally covaried.
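
For reference, the Chan–Darwiche distance named in this abstract is commonly defined as below for two distributions over the same finite sample space with the same support. The symbols $P$, $P'$, and $w$ are notation assumed here, not taken from the abstract.

```latex
D_{\mathrm{CD}}(P, P') \;=\; \ln \max_{w} \frac{P'(w)}{P(w)} \;-\; \ln \min_{w} \frac{P'(w)}{P(w)}
```

The distance is zero exactly when the two distributions coincide, since every ratio $P'(w)/P(w)$ then equals 1.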