    A Bayesian approach for on-line max and min auditing

    In this paper we consider the on-line max and min query auditing problem: given a private association between fields in a data set, a sequence of max and min queries that have already been posed about the data, their corresponding answers, and a new query, deny the answer if private information would be inferred, or give the true answer otherwise. We give a probabilistic definition of privacy and demonstrate that max and min queries, without the “no duplicates” assumption, can be audited by means of a Bayesian network. Moreover, we show how our auditing approach is able to manage user prior knowledge.

    Learning Fair Naive Bayes Classifiers by Discovering and Eliminating Discrimination Patterns

    As machine learning is increasingly used to make real-world decisions, recent research efforts aim to define and ensure fairness in algorithmic decision making. Existing methods often assume a fixed set of observable features to define individuals, but lack a discussion of certain features not being observed at test time. In this paper, we study fairness of naive Bayes classifiers, which allow partial observations. In particular, we introduce the notion of a discrimination pattern, which refers to an individual receiving different classifications depending on whether some sensitive attributes were observed. A model is then considered fair if it has no such pattern. We propose an algorithm to discover and mine for discrimination patterns in a naive Bayes classifier, and show how to learn maximum likelihood parameters subject to these fairness constraints. Our approach iteratively discovers and eliminates discrimination patterns until a fair model is learned. An empirical evaluation on three real-world datasets demonstrates that we can remove exponentially many discrimination patterns by only adding a small fraction of them as constraints.
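The idea of a discrimination pattern can be illustrated with a minimal sketch. It assumes a binary class and binary features, and it measures the effect of observing a sensitive attribute as a difference in posterior probability; the function names, the parameter layout, and the use of a posterior difference are illustrative assumptions, not the paper's implementation.

```python
def posterior(prior, likelihoods, observed):
    """Return P(D=1 | observed) for a binary naive Bayes model.

    prior:       P(D=1)
    likelihoods: {feature: (P(f=1 | D=1), P(f=1 | D=0))}
    observed:    {feature: 0 or 1}; features left out are treated as
                 unobserved, which in naive Bayes amounts to skipping them.
    """
    pos = prior
    neg = 1.0 - prior
    for feature, value in observed.items():
        p1, p0 = likelihoods[feature]
        pos *= p1 if value == 1 else 1.0 - p1
        neg *= p0 if value == 1 else 1.0 - p0
    return pos / (pos + neg)


def discrimination_degree(prior, likelihoods, sensitive, nonsensitive):
    """Posterior with the sensitive attributes observed minus the
    posterior without them (hypothetical helper for illustration)."""
    with_sensitive = posterior(prior, likelihoods, {**nonsensitive, **sensitive})
    without_sensitive = posterior(prior, likelihoods, nonsensitive)
    return with_sensitive - without_sensitive


# Toy parameters, invented for this example only.
toy_likelihoods = {"s": (0.8, 0.2), "x": (0.5, 0.5)}
degree = discrimination_degree(0.5, toy_likelihoods, {"s": 1}, {"x": 1})
```

Under this reading, a partial assignment would count as a discrimination pattern when the absolute value of `degree` exceeds some chosen threshold; the threshold itself is an assumption here.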

    XRay: Enhancing the Web's Transparency with Differential Correlation

    Today's Web services - such as Google, Amazon, and Facebook - leverage user data for varied purposes, including personalizing recommendations, targeting advertisements, and adjusting prices. At present, users have little insight into how their data is being used. Hence, they cannot make informed choices about the services they choose. To increase transparency, we developed XRay, the first fine-grained, robust, and scalable personal data tracking system for the Web. XRay predicts which data in an arbitrary Web account (such as emails, searches, or viewed products) is being used to target which outputs (such as ads, recommended products, or prices). XRay's core functions are service agnostic and easy to instantiate for new services, and they can track data within and across services. To make predictions independent of the audited service, XRay relies on the following insight: by comparing outputs from different accounts with similar, but not identical, subsets of data, one can pinpoint targeting through correlation. We show both theoretically, and through experiments on Gmail, Amazon, and YouTube, that XRay achieves high precision and recall by correlating data from a surprisingly small number of extra accounts. Comment: Extended version of a paper presented at the 23rd USENIX Security Symposium (USENIX Security 14).
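The insight of comparing accounts that hold overlapping subsets of data can be sketched as a toy score. This is not XRay's actual algorithm: the scoring rule (appearance rate in accounts that contain an input minus the rate in accounts that do not) and every name below are assumptions chosen for illustration.

```python
def targeting_scores(accounts, output):
    """Score each input by how strongly its presence goes with `output`.

    accounts: list of (inputs, outputs_seen) pairs, each a set.
    Returns {input: rate_with_input - rate_without_input}.
    """
    all_inputs = set().union(*(inputs for inputs, _ in accounts))
    scores = {}
    for item in all_inputs:
        seen_with = [output in outs for ins, outs in accounts if item in ins]
        seen_without = [output in outs for ins, outs in accounts if item not in ins]
        rate_with = sum(seen_with) / len(seen_with) if seen_with else 0.0
        rate_without = sum(seen_without) / len(seen_without) if seen_without else 0.0
        scores[item] = rate_with - rate_without
    return scores


# Three hypothetical accounts with overlapping inputs; the ad appears
# only in the accounts that contain input "a".
toy_accounts = [
    ({"a", "b"}, {"ad"}),
    ({"a", "c"}, {"ad"}),
    ({"b", "c"}, set()),
]
toy_scores = targeting_scores(toy_accounts, "ad")
likely_target = max(toy_scores, key=toy_scores.get)
```

In this toy data only input "a" is present in every account that shows the ad and absent from the one that does not, so it receives the highest score.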

    Costly risk verification without commitment in competitive

    This article analyzes the equilibrium of an insurance market in which individuals who take out an insurance policy have a duty of good faith when they reveal private information about their risk. Insurers can, at a certain cost, verify the type of policyholders who file a claim, and they are allowed to cancel the insurance contract retroactively if it is established that the policyholder misrepresented their risk when taking out the policy. However, insurers cannot commit to their risk verification strategy. The article analyzes the relationship between second-best Pareto optimality and the competitive equilibrium of the insurance market in a game-theoretic framework. It characterizes the contracts offered in equilibrium, individuals' contract choices, and the conditions for the existence of equilibrium.

    A Bayesian partial identification approach to inferring the prevalence of accounting misconduct

    This paper describes the use of flexible Bayesian regression models for estimating a partially identified probability function. Our approach permits efficient sensitivity analysis concerning the posterior impact of priors on the partially identified component of the regression model. The new methodology is illustrated on an important problem where only partially observed data is available - inferring the prevalence of accounting misconduct among publicly traded U.S. businesses.

    Sensitivity analysis in multilinear probabilistic models

    Sensitivity methods for the analysis of the outputs of discrete Bayesian networks have been extensively studied and implemented in different software packages. These methods usually focus on the study of sensitivity functions and on the impact of a parameter change to the Chan–Darwiche distance. Although not fully recognized, the majority of these results rely heavily on the multilinear structure of atomic probabilities in terms of the conditional probability parameters associated with this type of network. By defining a statistical model through the polynomial expression of its associated defining conditional probabilities, we develop here a unifying approach to sensitivity methods applicable to a large suite of models including extensions of Bayesian networks, for instance context-specific ones. Our algebraic approach enables us to prove that for models whose defining polynomial is multilinear both the Chan–Darwiche distance and any divergence in the family of ϕ-divergences are minimized for a certain class of multi-parameter contemporaneous variations when parameters are proportionally covaried.
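Proportional covariation, as the term is commonly used in Bayesian network sensitivity analysis, can be sketched for a single discrete distribution: when one probability is changed, the remaining ones are rescaled by a common factor so that they still sum to one and keep their original ratios. The function name and the example values are assumptions for illustration; the multi-parameter setting of the abstract is not covered.

```python
def covary_proportionally(dist, index, new_value):
    """Set dist[index] to new_value and rescale the other entries
    by (1 - new_value) / (1 - dist[index]) so the result sums to 1."""
    if not 0.0 <= new_value <= 1.0:
        raise ValueError("new_value must be a probability")
    old_value = dist[index]
    if old_value == 1.0:
        raise ValueError("cannot rescale: all other entries are zero")
    scale = (1.0 - new_value) / (1.0 - old_value)
    return [new_value if i == index else p * scale for i, p in enumerate(dist)]


# Toy distribution, invented for this example only.
varied = covary_proportionally([0.2, 0.3, 0.5], 0, 0.4)
```

Here the first entry moves from 0.2 to 0.4, and the other two are multiplied by 0.6 / 0.8, which preserves their ratio of 0.3 to 0.5.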
