20 research outputs found

    Enhancing Transparency and Control when Drawing Data-Driven Inferences about Individuals

    Recent studies have shown that information disclosed on social network sites (such as Facebook) can be used to predict personal characteristics with surprisingly high accuracy. In this paper we examine a method to give online users transparency into why certain inferences are made about them by statistical models, and control to inhibit those inferences by hiding ("cloaking") certain personal information from inference. We use this method to examine whether such transparency and control would be a reasonable goal by assessing how difficult it would be for users to actually inhibit inferences. Applying the method to data from a large collection of real users on Facebook, we show that a user must cloak only a small portion of her Facebook Likes in order to inhibit inferences about her personal characteristics. However, we also show that in response a firm could change its modeling of users to make cloaking more difficult. Comment: presented at 2016 ICML Workshop on Human Interpretability in Machine Learning (WHI 2016), New York, N

    Enhancing Transparency and Control when Drawing Data-Driven Inferences about Individuals

    Recent studies show the remarkable power of information disclosed by users on social network sites to infer the users' personal characteristics via predictive modeling. In response, attention is turning increasingly to the transparency that sites provide to users as to what inferences are drawn and why, as well as to what sort of control users can be given over inferences that are drawn about them. We draw on the evidence counterfactual as a means for providing transparency into why particular inferences are drawn about them. We then introduce the idea of a "cloaking device" as a vehicle to provide (and to study) control. Specifically, the cloaking device provides a mechanism for users to inhibit the use of particular pieces of information in inference; combined with the transparency provided by the evidence counterfactual, a user can control model-driven inferences while minimizing the amount of disruption to her normal activity. Using these analytical tools we ask two main questions: (1) How much information must users cloak in order to significantly affect inferences about their personal traits? We find that usually a user must cloak only a small portion of her actions in order to inhibit inference. We also find that, encouragingly, false positive inferences are significantly easier to cloak than true positive inferences. (2) Can firms change their modeling behavior to make cloaking more difficult? The answer is a definitive yes. In our main results we replicate the methodology of Kosinski et al. (2013) for modeling personal traits; then we demonstrate a simple modeling change that still gives accurate inferences of personal traits, but requires users to cloak substantially more information to affect the inferences drawn. The upshot is that organizations can provide transparency and control even into complicated, predictive model-driven inferences, but they also can make modeling choices to make control easier or harder for their users. Columbia University, New York University, NYU Stern School of Business, NYU Center for Data Scienc
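The cloaking idea in this abstract can be illustrated with a minimal sketch. It assumes a simple additive (linear) score over a user's Likes and a greedy strategy that hides the Likes with the largest positive weight first; the function name `cloak_greedy`, the weights, and the threshold are hypothetical and are not taken from the paper.

```python
def cloak_greedy(likes, weights, bias, threshold=0.0):
    """Hide Likes until the additive score falls below `threshold`.

    Assumption: the inference is positive when
    bias + sum(weights of visible Likes) >= threshold.
    Returns (cloaked_likes, final_score).
    """
    score = bias + sum(weights.get(like, 0.0) for like in likes)
    cloaked = []
    # Consider the Likes that push the score up the most first.
    ranked = sorted(set(likes), key=lambda like: weights.get(like, 0.0), reverse=True)
    for like in ranked:
        if score < threshold:
            break  # the inference is already inhibited
        w = weights.get(like, 0.0)
        if w <= 0:
            break  # hiding the remaining Likes cannot lower the score
        score -= w
        cloaked.append(like)
    return cloaked, score


# Hypothetical weights for four Likes.
weights = {"a": 2.0, "b": 1.0, "c": -0.5, "d": 0.25}
cloaked, score = cloak_greedy(["a", "b", "c", "d"], weights, bias=-1.0)
print(cloaked, score)
```

In this toy case a single Like carries most of the evidence, so hiding it alone inhibits the inference. A model that spreads evidence over many small weights would require cloaking more Likes, which is the kind of modeling change the abstract describes.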

    Enhancing Transparency and Control when Drawing Data-Driven Inferences about Individuals

    Recent studies show the remarkable power of information disclosed by users on social network sites to infer the users' personal characteristics via predictive modeling. In response, attention is turning increasingly to the transparency that sites provide to users as to what inferences are drawn and why, as well as to what sort of control users can be given over inferences that are drawn about them. We draw on the evidence counterfactual as a means for providing transparency into why particular inferences are drawn about them. We then introduce the idea of a "cloaking device" as a vehicle to provide (and to study) control. Specifically, the cloaking device provides a mechanism for users to inhibit the use of particular pieces of information in inference; combined with the transparency provided by the evidence counterfactual, a user can control model-driven inferences while minimizing the amount of disruption to her normal activity. Using these analytical tools we ask two main questions: (1) How much information must users cloak in order to significantly affect inferences about their personal traits? We find that usually a user must cloak only a small portion of her actions in order to inhibit inference. We also find that, encouragingly, false positive inferences are significantly easier to cloak than true positive inferences. (2) Can firms change their modeling behavior to make cloaking more difficult? The answer is a definitive yes. In our main results we replicate the methodology of Kosinski et al. (2013) for modeling personal traits; then we demonstrate a simple modeling change that still gives accurate inferences of personal traits, but requires users to cloak substantially more information to affect the inferences drawn. The upshot is that organizations can provide transparency and control even into complicated, predictive model-driven inferences, but they also can make modeling choices to make control easier or harder for their users.

    Counterfactual Explanations for Data-Driven Decisions

    Users' lack of understanding of systems that use predictive models to make automated decisions is one of the main barriers to their adoption. We adopt the increasingly accepted view of a counterfactual explanation for a system decision: a set of the system inputs that is causal (meaning that removing them changes the decision) and irreducible (meaning that removing any proper subset of the inputs in the explanation does not change the decision). We generalize previous work on counterfactual explanations in three ways: we explain system decisions rather than model predictions; we do not enforce any specific method for removing inputs; and our explanations can incorporate inputs with arbitrary data structures. We also show how model-agnostic algorithms can be tweaked to find the most useful explanations depending on the context. Finally, we showcase our approach using a real data set to illustrate its advantages over other explanation methods when the goal is to understand system decisions better.
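The two conditions in this definition (causal and irreducible) can be checked directly on a toy system. The sketch below is illustrative only: it assumes that "removing" an input means dropping it from a set of active inputs and that the decision is a thresholded additive score, and it uses brute-force enumeration rather than the heuristic search the abstract mentions. The names `is_counterfactual_explanation` and `smallest_explanations` are hypothetical.

```python
from itertools import combinations


def is_counterfactual_explanation(explanation, inputs, decision_fn):
    """True if `explanation` is causal and irreducible for decision_fn(inputs)."""
    inputs = frozenset(inputs)
    explanation = frozenset(explanation)
    original = decision_fn(inputs)
    # Causal: removing the whole set changes the decision.
    if decision_fn(inputs - explanation) == original:
        return False
    # Irreducible: removing any proper, non-empty subset leaves it unchanged.
    for size in range(1, len(explanation)):
        for subset in combinations(explanation, size):
            if decision_fn(inputs - frozenset(subset)) != original:
                return False
    return True


def smallest_explanations(inputs, decision_fn):
    """All minimum-size input sets whose removal flips the decision.

    A minimum-size flipping set is irreducible by construction, because
    every proper subset is smaller and therefore does not flip the decision.
    """
    inputs = frozenset(inputs)
    original = decision_fn(inputs)
    for size in range(1, len(inputs) + 1):
        found = [set(c) for c in combinations(sorted(inputs), size)
                 if decision_fn(inputs - frozenset(c)) != original]
        if found:
            return found
    return []


# Toy system (assumed): approve when the additive score is non-negative.
weights = {"x": 3.0, "y": 1.0, "z": 1.0}
decide = lambda active: -3.5 + sum(weights[f] for f in active) >= 0.0

print(smallest_explanations(weights, decide))
print(is_counterfactual_explanation({"y", "z"}, weights, decide))
print(is_counterfactual_explanation({"x", "y"}, weights, decide))
```

Here `{"x", "y"}` changes the decision but is not an explanation under the definition, because removing `"x"` alone already changes it.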

    Explaining Data-Driven Decisions made by AI Systems: The Counterfactual Approach

    We examine counterfactual explanations for explaining the decisions made by model-based AI systems. The counterfactual approach we consider defines an explanation as a set of the system's data inputs that causally drives the decision (i.e., changing the inputs in the set changes the decision) and is irreducible (i.e., changing any proper subset of the inputs does not change the decision). We (1) demonstrate how this framework may be used to provide explanations for decisions made by general, data-driven AI systems that may incorporate features with arbitrary data types and multiple predictive models, and (2) propose a heuristic procedure to find the most useful explanations depending on the context. We then contrast counterfactual explanations with methods that explain model predictions by weighting features according to their importance (e.g., SHAP, LIME) and present two fundamental reasons why we should carefully consider whether importance-weight explanations are well-suited to explain system decisions. Specifically, we show that (i) features that have a large importance weight for a model prediction may not affect the corresponding decision, and (ii) importance weights are insufficient to communicate whether and how features influence decisions. We demonstrate this with several concise examples and three detailed case studies that compare the counterfactual approach with SHAP to illustrate various conditions under which counterfactual explanations explain data-driven decisions better than importance weights.
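Point (i) can be seen in a small constructed example. The sketch assumes an additive score with a zero baseline, so each feature's contribution to the score serves as a stand-in for a SHAP-style importance weight; the contributions, the threshold, and the function names are hypothetical and do not come from the paper's case studies.

```python
from itertools import combinations

# Assumed per-feature contributions to an additive score (zero baseline).
CONTRIB = {"a": 1.0, "b": -0.75, "c": 0.5}
THRESHOLD = 0.5


def decision(active):
    """Positive decision when the summed contributions reach the threshold."""
    return sum(CONTRIB[f] for f in active) >= THRESHOLD


def irreducible_explanations(features):
    """Enumerate every irreducible feature set whose removal flips the decision."""
    full = frozenset(features)
    original = decision(full)
    found = []
    for size in range(1, len(full) + 1):
        for combo in combinations(sorted(full), size):
            removed = frozenset(combo)
            # A set containing a smaller flipping set is reducible: skip it.
            if any(prev < removed for prev in found):
                continue
            if decision(full - removed) != original:
                found.append(removed)
    return found


ranking = sorted(CONTRIB, key=lambda f: abs(CONTRIB[f]), reverse=True)
explanations = irreducible_explanations(CONTRIB)
print("importance ranking:", ranking)
print("explanations:", [sorted(e) for e in explanations])
```

In this example feature `b` has the second-largest weight by magnitude yet appears in no counterfactual explanation of the positive decision, while `c`, with the smallest weight, is an explanation on its own.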

    Matching Code and Law: Achieving Algorithmic Fairness with Optimal Transport

    Increasingly, discrimination by algorithms is perceived as a societal and legal problem. As a response, a number of criteria for implementing algorithmic fairness in machine learning have been developed in the literature. This paper proposes the Continuous Fairness Algorithm (CFAθ) which enables a continuous interpolation between different fairness definitions. More specifically, we make three main contributions to the existing literature. First, our approach allows the decision maker to continuously vary between specific concepts of individual and group fairness. As a consequence, the algorithm enables the decision maker to adopt intermediate "worldviews" on the degree of discrimination encoded in algorithmic processes, adding nuance to the extreme cases of "we're all equal" (WAE) and "what you see is what you get" (WYSIWYG) proposed so far in the literature. Second, we use optimal transport theory, and specifically the concept of the barycenter, to maximize decision maker utility under the chosen fairness constraints. Third, the algorithm is able to handle cases of intersectionality, i.e., of multi-dimensional discrimination of certain groups on grounds of several criteria. We discuss three main examples (credit applications; college admissions; insurance contracts) and map out the legal and policy implications of our approach. The explicit formalization of the trade-off between individual and group fairness allows this post-processing approach to be tailored to different situational contexts in which one or the other fairness criterion may take precedence. Finally, we evaluate our model experimentally. Comment: Vastly extended new version, now including computational experiment
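The interpolation idea can be sketched for the one-dimensional case, where the Wasserstein-2 barycenter of several score distributions has a quantile function equal to the weighted average of the group quantile functions. The code below is an illustrative assumption, not the paper's algorithm: `repair_scores`, the empirical-quantile estimate, and the convention that theta = 0 keeps the raw scores while theta = 1 maps every group onto the barycenter are all hypothetical choices.

```python
def quantile(sorted_vals, q):
    """Empirical quantile with linear interpolation between order statistics."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac


def repair_scores(scores, groups, theta):
    """Move each score a fraction `theta` of the way to the group barycenter."""
    by_group = {}
    for i, g in enumerate(groups):
        by_group.setdefault(g, []).append(i)
    n = len(scores)
    sorted_by_group = {g: sorted(scores[i] for i in idx)
                       for g, idx in by_group.items()}
    repaired = list(scores)
    for g, idx in by_group.items():
        # Rank each individual within their own group.
        order = sorted(idx, key=lambda i: scores[i])
        for rank, i in enumerate(order):
            q = (rank + 0.5) / len(order)
            # Barycenter quantile: size-weighted average of group quantiles.
            target = sum(len(by_group[h]) / n * quantile(sorted_by_group[h], q)
                         for h in by_group)
            repaired[i] = (1 - theta) * scores[i] + theta * target
    return repaired


scores = [0.2, 0.4, 0.6, 0.5, 0.7, 0.9]
groups = ["a", "a", "a", "b", "b", "b"]
for theta in (0.0, 0.5, 1.0):
    print(theta, repair_scores(scores, groups, theta))
```

Within each group the ranking of individuals is preserved for every theta, while the gap between the group distributions shrinks as theta grows.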