311 research outputs found

    Prediction without Preclusion: Recourse Verification with Reachable Sets

    Machine learning models are often used to decide who will receive a loan, a job interview, or a public benefit. Standard techniques to build these models use features about people but overlook their actionability. In turn, models can assign predictions that are fixed, meaning that consumers who are denied loans, interviews, or benefits may be permanently locked out from access to credit, employment, or assistance. In this work, we introduce a formal testing procedure, which we call recourse verification, to flag models that assign fixed predictions. We develop machinery to reliably determine whether a given model can provide recourse to its decision subjects under a set of user-specified actionability constraints. We demonstrate how our tools can ensure recourse and adversarial robustness in real-world datasets and use them to study the infeasibility of recourse in real-world lending datasets. Our results highlight how models can inadvertently assign fixed predictions that permanently bar access, and we provide tools to design algorithms that account for actionability when developing models.
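
    The idea of checking for fixed predictions can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes a small discrete feature space and simply enumerates every point reachable from an individual under the actionability constraints. The feature layout, the helper names (`immutable`, `increase_up_to`, `can_switch_on`), and `toy_model` are all hypothetical.

```python
from itertools import product

# Hypothetical actionability constraints: each maps a feature's current
# value to the list of values that feature is allowed to take.
def immutable(v):
    return [v]

def increase_up_to(limit):
    return lambda v: list(range(v, limit + 1))

def can_switch_on(v):
    return [v] if v == 1 else [0, 1]

def reachable_set(x, action_sets):
    """Enumerate every point reachable from x by brute force."""
    options = [action_sets[j](x[j]) for j in range(len(x))]
    return [tuple(p) for p in product(*options)]

def has_recourse(x, model, action_sets, target=1):
    """True if some reachable point receives the target prediction."""
    return any(model(p) == target for p in reachable_set(x, action_sets))

# Hypothetical features: (age_over_60, n_accounts, has_guarantor).
ACTION_SETS = [immutable, increase_up_to(3), can_switch_on]

def toy_model(p):
    # Assumed linear threshold rule, for illustration only.
    return 1 if 2 * p[1] + 3 * p[2] - 4 * p[0] >= 6 else 0

applicant_a = (0, 1, 0)  # denied, but can reach an approved point
applicant_b = (1, 1, 0)  # denied, and every reachable point is denied

print(has_recourse(applicant_a, toy_model, ACTION_SETS))
print(has_recourse(applicant_b, toy_model, ACTION_SETS))
```

    In this toy setup the second applicant's prediction is fixed: no combination of allowed actions changes the outcome. Brute-force enumeration only works for small discrete spaces.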

    Responsible and Regulatory Conform Machine Learning for Medicine: A Survey of Challenges and Solutions

    Machine learning is expected to fuel significant improvements in medical care. To ensure that fundamental principles such as beneficence, respect for human autonomy, prevention of harm, justice, privacy, and transparency are respected, medical machine learning systems must be developed responsibly. Many high-level declarations of ethical principles have been put forth for this purpose, but there is a severe lack of technical guidelines explicating the practical consequences for medical machine learning. Similarly, there is currently considerable uncertainty regarding the exact regulatory requirements placed upon medical machine learning systems. This survey provides an overview of the technical and procedural challenges involved in creating medical machine learning systems responsibly and in conformity with existing regulations, as well as possible solutions to address these challenges. First, a brief review of existing regulations affecting medical machine learning is provided, showing that properties such as safety, robustness, reliability, privacy, security, transparency, explainability, and nondiscrimination are all demanded already by existing law and regulations - albeit, in many cases, to an uncertain degree. Next, the key technical obstacles to achieving these desirable properties are discussed, as well as important techniques to overcome these obstacles in the medical context. We notice that distribution shift, spurious correlations, model underspecification, uncertainty quantification, and data scarcity represent severe challenges in the medical context. Promising solution approaches include the use of large and representative datasets and federated learning as a means to that end, the careful exploitation of domain knowledge, the use of inherently transparent models, comprehensive out-of-distribution model testing and verification, as well as algorithmic impact assessments.

    ISIPTA'07: Proceedings of the Fifth International Symposium on Imprecise Probability: Theories and Applications


    Cumulative Distribution Functions As The Foundation For Probabilistic Models

    This thesis discusses applications of probabilistic and connectionist models for constructing and training cumulative distribution functions (CDFs). First, it is shown how existing tools from the copula literature can be combined to build probabilistic models. It is found that this simple construction leads to numerical and scalability issues that make training and inference challenging. Next, several innovative ideas, combining neural networks, automatic differentiation and copula functions, introduce how to assemble black-box probabilistic models. The basic building block is a cumulative distribution function that is straightforward to construct, composed of arithmetic operations and nonlinear functions. There is no need to assume any specific parametric probability density function (PDF), making the model flexible and normalisation unnecessary. The only requirement is to design a computational graph that parameterises monotonically non-decreasing functions with a constrained range. Training can then be performed using standard tools from any neural network software library. Finally, factorial hidden Markov models (FHMMs) for sequential data are presented. It is shown how to leverage cumulative distribution functions in the form of the Gaussian copula and an amortised stochastic variational method to encode hidden Markov chains coherently. This approach enables efficient learning and inference to model long sequences of high-dimensional data with long-range dependencies. Tackling such complex problems was impossible with the established FHMM approximate inference algorithm. It is empirically verified on several problems that some of the estimators introduced in this work can perform comparably to or better than the currently popular models, especially for tasks requiring tail-area or marginal probabilities that can be read directly from a cumulative distribution function.
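
    The requirement of a monotonically non-decreasing function with a constrained range can be sketched as follows. This is one simple construction chosen for illustration, not the thesis's architecture: a convex combination of sigmoids with positive slopes. The function `make_cdf` and its parameter names are hypothetical.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def make_cdf(raw_slopes, shifts, raw_weights):
    """Build F(x) = sum_k w_k * sigmoid(a_k * x + b_k).

    Slopes a_k = exp(raw_slope) are strictly positive, so every term is
    non-decreasing in x. Weights w_k come from a softmax, so they are
    positive and sum to 1, which keeps F(x) inside (0, 1).
    """
    slopes = [math.exp(a) for a in raw_slopes]
    m = max(raw_weights)
    exps = [math.exp(v - m) for v in raw_weights]
    total = sum(exps)
    weights = [e / total for e in exps]

    def cdf(x):
        return sum(w * sigmoid(a * x + b)
                   for w, a, b in zip(weights, slopes, shifts))
    return cdf

# Arbitrary unconstrained parameters; any real values give a valid CDF.
cdf = make_cdf(raw_slopes=[0.0, 1.0], shifts=[-1.0, 2.0],
               raw_weights=[0.3, -0.2])

# A tail-area probability is read directly from the CDF.
tail = 1.0 - cdf(3.0)
print(tail)
```

    Because the constraints are enforced by the parameterisation itself, the raw parameters can be optimised freely with any gradient-based library and no normalising constant is needed.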

    Game theoretic and machine learning techniques for balancing games

    Game balance is the problem of determining the fairness of actions or sets of actions in competitive, multiplayer games. This problem primarily arises in the context of designing board and video games. Traditionally, balance has been achieved through large amounts of play-testing and trial-and-error on the part of the designers. In this thesis, it is our intent to lay down the beginnings of a framework for a formal and analytical solution to this problem, combining techniques from game theory and machine learning. We first develop a set of game-theoretic definitions for different forms of balance, and then introduce the concept of a strategic abstraction. We show how machine classification techniques can be used to identify high-level player strategy in games, using the two principal methods of sequence alignment and Naive Bayes classification. Bioinformatics sequence alignment, when combined with a 3-nearest neighbor classification approach, can, with only 3 exemplars of each strategy, correctly identify the strategy used in 55% of cases using all data, and 77% of cases on data that experts indicated actually had a strategic class. Naive Bayes classification achieves similar results, with 65% accuracy on all data and 75% accuracy on data rated to have an actual class. We then show how these game theoretic and machine learning techniques can be combined to automatically build matrices that can be used to analyze game balance properties.
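
    Combining sequence alignment with 3-nearest-neighbor classification can be sketched as below. The abstract does not say which alignment algorithm or scoring scheme the thesis uses, so this sketch assumes Needleman-Wunsch global alignment with unit scores. The action alphabet (`W` for worker, `A` for attack, `B` for building), the strategy labels, and the exemplar sequences are all hypothetical.

```python
from collections import Counter

def alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment score of two sequences."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cur.append(max(diag, prev[j] + gap, cur[j - 1] + gap))
        prev = cur
    return prev[-1]

def classify(seq, exemplars, k=3):
    """Majority vote among the k exemplars that align best with seq."""
    ranked = sorted(exemplars,
                    key=lambda e: alignment_score(seq, e[0]),
                    reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Three hypothetical exemplars per strategy, as action strings.
EXEMPLARS = [
    ("WAAA", "rush"), ("WWAAA", "rush"), ("AAWA", "rush"),
    ("WWWWB", "economy"), ("WWBWW", "economy"), ("WWWBA", "economy"),
]

print(classify("WAAAA", EXEMPLARS))
print(classify("WWWB", EXEMPLARS))
```

    A higher alignment score means two action sequences are more similar, so it serves as the nearness measure for the neighbor vote.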