
    Achieving non-discrimination in prediction

    Discrimination-aware classification is receiving increasing attention in data science. Pre-process methods for constructing a discrimination-free classifier first remove discrimination from the training data and then learn the classifier from the cleaned data. However, they lack a theoretical guarantee against potential discrimination when the classifier is deployed for prediction. In this paper, we fill this gap by mathematically bounding the probability that the discrimination in prediction falls within a given interval, in terms of the training data and the classifier. We adopt the causal model to describe the data generation mechanism and formally define discrimination in the population, in a dataset, and in prediction. We obtain two important theoretical results: (1) discrimination in prediction can still exist even if discrimination in the training data is completely removed; and (2) not all pre-process methods can ensure non-discrimination in prediction even though they achieve non-discrimination in the modified training data. Based on these results, we develop a two-phase framework for constructing a discrimination-free classifier with a theoretical guarantee. Experiments confirm the theoretical results and show the effectiveness of our two-phase framework.
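
    The paper's definitions are causal, but a minimal associational sketch can illustrate theoretical result (1): a training set can look discrimination-free while the learned classifier's predictions do not. The sketch below uses the risk difference (demographic-parity gap) as a stand-in measure, not the authors' causal definition; all data and numbers are made up.

        import numpy as np

        def risk_difference(y, s):
            """Demographic-parity gap: P(y = 1 | s = 1) - P(y = 1 | s = 0)."""
            y, s = np.asarray(y), np.asarray(s)
            return y[s == 1].mean() - y[s == 0].mean()

        # Toy data: the labels are independent of the protected attribute s,
        # but a (hypothetical) classifier that picked up proxy features
        # reintroduces a gap in its predictions.
        rng = np.random.default_rng(0)
        s = rng.integers(0, 2, size=10_000)
        y_train = rng.integers(0, 2, size=10_000)
        y_pred = (rng.random(10_000) < 0.45 + 0.10 * s).astype(int)

        print(risk_difference(y_train, s))  # ~0.00: training data looks clean
        print(risk_difference(y_pred, s))   # ~0.10: discrimination in prediction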

    Fair Inputs and Fair Outputs: The Incompatibility of Fairness in Privacy and Accuracy

    Fairness concerns about algorithmic decision-making systems have mainly focused on the outputs (e.g., the accuracy of a classifier across individuals or groups). However, one may additionally be concerned with fairness in the inputs. In this paper, we propose and formulate two properties regarding the inputs of (the features used by) a classifier. In particular, we claim that fair privacy (whether all individuals are asked to reveal the same information) and need-to-know (whether users are asked only for the minimal information required for the task at hand) are desirable properties of a decision system. We explore the interaction between these properties and fairness in the outputs (fair prediction accuracy). We show that, for an optimal classifier, these three properties are in general incompatible, and we explain which common properties of data make them incompatible. Finally, we provide an algorithm to verify whether the trade-off between the three properties exists in a given dataset, and use the algorithm to show that this trade-off is common in real data.
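
    The verification algorithm itself is not given in the abstract; as a hypothetical sketch of just the need-to-know property, one can brute-force the smallest feature subsets whose cross-validated accuracy stays within a tolerance of the all-features accuracy. The function name, model choice and tolerance below are my own assumptions.

        from itertools import combinations

        from sklearn.linear_model import LogisticRegression
        from sklearn.model_selection import cross_val_score

        def minimal_sufficient_subsets(X, y, names, tol=0.01):
            """Smallest feature subsets whose CV accuracy is within `tol`
            of the accuracy obtained from all features ('need-to-know')."""
            model = LogisticRegression(max_iter=1000)
            full = cross_val_score(model, X, y, cv=5).mean()
            for k in range(1, X.shape[1] + 1):
                hits = [([names[i] for i in idx],
                         cross_val_score(model, X[:, list(idx)], y, cv=5).mean())
                        for idx in combinations(range(X.shape[1]), k)]
                hits = [h for h in hits if h[1] >= full - tol]
                if hits:
                    return hits  # these subsets are 'enough to know'
            return []

    Exhaustive subset search is exponential in the number of features, so this only scales to small feature sets; it merely illustrates the property being checked.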

    Matching Code and Law: Achieving Algorithmic Fairness with Optimal Transport

    Increasingly, discrimination by algorithms is perceived as a societal and legal problem. In response, a number of criteria for implementing algorithmic fairness in machine learning have been developed in the literature. This paper proposes the Continuous Fairness Algorithm (CFAθ), which enables a continuous interpolation between different fairness definitions. More specifically, we make three main contributions to the existing literature. First, our approach allows the decision maker to vary continuously between specific concepts of individual and group fairness. As a consequence, the algorithm enables the decision maker to adopt intermediate "worldviews" on the degree of discrimination encoded in algorithmic processes, adding nuance to the extreme cases of "we're all equal" (WAE) and "what you see is what you get" (WYSIWYG) proposed so far in the literature. Second, we use optimal transport theory, and specifically the concept of the barycenter, to maximize decision-maker utility under the chosen fairness constraints. Third, the algorithm is able to handle cases of intersectionality, i.e., multi-dimensional discrimination of certain groups on grounds of several criteria. We discuss three main examples (credit applications, college admissions, insurance contracts) and map out the legal and policy implications of our approach. The explicit formalization of the trade-off between individual and group fairness allows this post-processing approach to be tailored to different situational contexts in which one or the other fairness criterion may take precedence. Finally, we evaluate our model experimentally.
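
    As a rough illustration of the optimal-transport machinery: in one dimension, the Wasserstein barycenter of the group-wise score distributions is simply the pointwise average of their quantile functions, and interpolating toward it gives a continuum between raw and fully group-fair scores. The sketch below assumes θ = 0 keeps the raw scores (WYSIWYG end) and θ = 1 maps every group onto the barycenter (WAE end); the function name and everything beyond the barycenter step are assumptions, not the paper's actual algorithm.

        import numpy as np

        def cfa_theta(scores, groups, theta):
            """Sketch of a CFA(theta)-style post-processing step for 1-D scores."""
            scores, groups = np.asarray(scores, float), np.asarray(groups)
            grid = np.linspace(0.0, 1.0, 101)
            # 1-D Wasserstein barycenter: mean of the group quantile functions.
            bary = np.mean([np.quantile(scores[groups == g], grid)
                            for g in np.unique(groups)], axis=0)
            out = scores.copy()
            for g in np.unique(groups):
                m = groups == g
                # each score's within-group rank, mapped to [0, 1]
                ranks = scores[m].argsort().argsort() / max(m.sum() - 1, 1)
                # move each score a fraction theta along its transport map
                out[m] = (1 - theta) * scores[m] \
                         + theta * np.interp(ranks, grid, bary)
            return out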

    Are situation awareness and decision-making in driving totally conscious processes? Results of a Hazard Prediction task

    Detecting danger in the driving environment is an indispensable task for guaranteeing safety, and it depends on the driver's ability to predict upcoming hazards. But does correct prediction lead to an appropriate response? This study advances hazard perception research by investigating the link between successful prediction and response selection. Three groups of drivers (learners, novices and experienced drivers) were recruited, with novice and experienced drivers further split into offender and non-offender groups. Specifically, this work aims to develop an improved Spanish Hazard Prediction Test and to explore the differences in Situation Awareness (SA: perception, comprehension and prediction) and Decision-Making ("DM") among learner, younger inexperienced and experienced drivers, and between driving offenders and non-offenders. The contribution of the current work is not only theoretical: the Hazard Prediction Test is also a valid way to test hazard perception. The test, as well as being useful as part of the test for a driving license, could also serve a purpose in the renewal of licenses after a ban or as a way of training drivers. A sample of 121 participants watched a series of driving video clips that ended with a sudden occlusion prior to a hazard. They then answered questions to assess their SA ("What is the hazard?", "Where is it located?", "What happens next?") and DM ("What would you do in this situation?"). This alternative to the Hazard Perception Test demonstrates satisfactory internal consistency (α = 0.750), with eleven videos achieving discrimination indices above 0.30. Learners performed significantly worse than experienced drivers when required to identify and locate the hazard. Interestingly, drivers were more accurate in answering the DM question than the questions regarding SA, suggesting that drivers can choose an appropriate response manoeuvre without fully conscious knowledge of the exact hazard.
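
    The abstract reports internal consistency and per-item discrimination indices without giving formulas; the standard psychometric computations sketched below (Cronbach's α, and the corrected item-total correlation as one common definition of a discrimination index) are my assumption about what was computed.

        import numpy as np

        def cronbach_alpha(scores):
            """Cronbach's alpha for an (n_participants, n_items) score matrix."""
            scores = np.asarray(scores, float)
            k = scores.shape[1]
            return k / (k - 1) * (1 - scores.var(axis=0, ddof=1).sum()
                                  / scores.sum(axis=1).var(ddof=1))

        def discrimination_indices(scores):
            """Corrected item-total correlation for each item/video."""
            scores = np.asarray(scores, float)
            total = scores.sum(axis=1)
            return np.array([np.corrcoef(scores[:, j], total - scores[:, j])[0, 1]
                             for j in range(scores.shape[1])])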