    A Performance-Explainability-Fairness Framework For Benchmarking ML Models

    Machine learning (ML) models have achieved remarkable success in various applications; however, ensuring their robustness and fairness remains a critical challenge. In this research, we present a comprehensive framework designed to evaluate and benchmark ML models through the lenses of performance, explainability, and fairness. This framework addresses the increasing need for a holistic assessment of ML models, considering not only their predictive power but also their interpretability and equitable deployment. The proposed framework leverages a multi-faceted evaluation approach, integrating performance metrics with explainability and fairness assessments. Performance evaluation incorporates standard measures such as accuracy, precision, and recall, and extends to the overall balanced error rate and the overall area under the receiver operating characteristic (ROC) curve (AUC), to capture model behavior across different performance aspects. Explainability assessment employs state-of-the-art techniques to quantify the interpretability of model decisions, ensuring that model behavior can be understood and trusted by stakeholders. The fairness evaluation examines model predictions in terms of demographic parity and equalized odds, thereby addressing concerns of bias and discrimination in the deployment of ML systems. To demonstrate the practical utility of the framework, we apply it to a diverse set of ML algorithms across various functional domains, including finance, criminology, education, and healthcare prediction. The results showcase the importance of a balanced evaluation approach, revealing trade-offs between performance, explainability, and fairness that can inform model selection and deployment decisions. Furthermore, we provide insights into the analysis of trade-offs in selecting the appropriate model for use cases where performance, interpretability, and fairness are important.
In summary, the Performance-Explainability-Fairness Framework offers a unified methodology for evaluating and benchmarking ML models, enabling practitioners and researchers to make informed decisions about model suitability and ensuring responsible and equitable AI deployment. We believe that this framework represents a crucial step towards building trustworthy and accountable ML systems in an era where AI plays an increasingly prominent role in decision-making processes.
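
The two fairness criteria named in this abstract, demographic parity and equalized odds, can be illustrated with a minimal sketch. The function names, the binary-label setting, and the toy data below are assumptions made for illustration; they are not taken from the framework itself. Demographic parity compares positive-prediction rates across groups, while equalized odds compares true-positive and false-positive rates across groups.

```python
def rate(values):
    """Mean of a list of 0/1 values; 0.0 for an empty list."""
    return sum(values) / len(values) if values else 0.0


def demographic_parity_difference(y_pred, group):
    """Largest gap in positive-prediction rate between any two groups."""
    rates = [
        rate([p for p, g in zip(y_pred, group) if g == k])
        for k in sorted(set(group))
    ]
    return max(rates) - min(rates)


def equalized_odds_difference(y_true, y_pred, group):
    """Larger of the TPR gap and the FPR gap between groups."""
    gaps = []
    for label in (1, 0):  # label 1 gives the TPR, label 0 gives the FPR
        rates = [
            rate([p for t, p, g in zip(y_true, y_pred, group)
                  if g == k and t == label])
            for k in sorted(set(group))
        ]
        gaps.append(max(rates) - min(rates))
    return max(gaps)


# Hypothetical toy data: two groups of four individuals each.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 0, 1, 0, 1, 1, 0, 0]
group = ["a", "a", "a", "a", "b", "b", "b", "b"]

dp_gap = demographic_parity_difference(y_pred, group)
eo_gap = equalized_odds_difference(y_true, y_pred, group)
print(dp_gap, eo_gap)
```

In this toy example both groups receive positive predictions at the same rate, so the demographic parity gap is zero, yet the error rates differ between groups, so the equalized odds gap is not. This is one simple way the two criteria can disagree.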

    Robust Fairness under Covariate Shift

    Making predictions that are fair with regard to protected group membership (race, gender, age, etc.) has become an important requirement for classification algorithms. Existing techniques derive a fair model from sampled labeled data, relying on the assumption that training and testing data are independently and identically distributed (iid), that is, drawn from the same distribution. In practice, distribution shift can and does occur between training and testing datasets as the characteristics of individuals interacting with the machine learning system change. We investigate fairness under covariate shift, a relaxation of the iid assumption in which the inputs or covariates change while the conditional label distribution remains the same. We seek fair decisions under these assumptions on target data with unknown labels. We propose an approach that obtains the predictor that is robust to the worst case in terms of target performance while satisfying target fairness requirements and matching statistical properties of the source data. We demonstrate the benefits of our approach on benchmark prediction tasks.
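
The covariate shift assumption described in this abstract can be written compactly. The notation below is an assumption introduced for illustration (the abstract gives none): $P_{\mathrm{src}}$ and $P_{\mathrm{trg}}$ denote the source (training) and target (testing) distributions over inputs $x$ and labels $y$.

```latex
% Covariate shift: the input marginals may differ,
% while the conditional label distribution is shared.
\begin{align*}
  P_{\mathrm{src}}(x) &\neq P_{\mathrm{trg}}(x), \\
  P_{\mathrm{src}}(y \mid x) &= P_{\mathrm{trg}}(y \mid x).
\end{align*}
```

Under the iid assumption both lines would be equalities; covariate shift relaxes only the first.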

    Learning Provably Useful Representations, with Applications to Fairness

    Representation learning involves transforming data so that it is useful for solving a particular supervised learning problem. The aim is to learn a representation function which maps inputs to some representation space, and a hypothesis which maps the representation space to targets. It is possible to learn a representation function using unlabeled data or data from a probability distribution other than that of the main problem of interest, which is helpful if labeled data is scarce. This approach has been successfully applied in practice, for example through pre-trained neural networks in computer vision and word embeddings in natural language processing. This thesis explores when it is possible to learn representations that are provably useful. We consider learning a representation function from unlabeled data, and propose an approach to identifying conditions where this technique will be useful for a subsequent supervised learning task. The approach requires shared structure in the labeled and unlabeled distributions, as well as a compatible representation function class and hypothesis class. We provide an example where representation learning can exploit cluster structure present in the data. We also consider learning a representation function from a source task distribution and re-using it on a target task of interest, and again propose conditions where this approach will be successful. In this case the conditions depend on shared structure between source and target task distributions. We provide an example involving the transfer of weights in a two-layer feedforward neural network. Representation learning can be applied to another topic of interest: fairness in machine learning. The issue of fairness arises when machine learning systems make or provide advice on decisions about people. A common approach to defining fairness is measuring differences in decisions made by an algorithm for one demographic group compared to another.
One approach to preventing discrimination against particular groups is to learn a representation of the data from which it is not possible for an adversary to determine an individual's group membership, but which preserves other useful information. We quantify the costs and benefits of such an approach with respect to several possible fairness definitions. We also examine the relationships between different definitions of fairness and show cases where they cannot simultaneously be satisfied. We explore the use of representation learning for fairness through two case studies: predicting domestic violence recidivism while avoiding discrimination on the basis of race, and predicting student outcomes at university while avoiding discrimination on the basis of gender. Our case studies reveal both the utility of fair representation learning and the trade-offs between accuracy and the definitions of fairness considered.

    Algorithmic fairness and bias mitigation for clinical machine learning with deep reinforcement learning

    As models based on machine learning continue to be developed for healthcare applications, greater effort is needed to ensure that these technologies do not reflect or exacerbate any unwanted or discriminatory biases that may be present in the data. Here we introduce a reinforcement learning framework capable of mitigating biases that may have been acquired during data collection. In particular, we evaluated our model for the task of rapidly predicting COVID-19 for patients presenting to hospital emergency departments and aimed to mitigate any site (hospital)-specific and ethnicity-based biases present in the data. Using a specialized reward function and training procedure, we show that our method achieves clinically effective screening performance, while significantly improving outcome fairness compared with current benchmarks and state-of-the-art machine learning methods. We performed external validation across three independent hospitals, and additionally tested our method on a patient intensive care unit discharge status task, demonstrating model generalizability.

    When Fair Classification Meets Noisy Protected Attributes

    The operationalization of algorithmic fairness comes with several practical challenges, not the least of which is the availability or reliability of protected attributes in datasets. In real-world contexts, practical and legal impediments may prevent the collection and use of demographic data, making it difficult to ensure algorithmic fairness. While initial fairness algorithms did not consider these limitations, recent proposals aim to achieve algorithmic fairness in classification by incorporating noisiness in protected attributes or not using protected attributes at all. To the best of our knowledge, this is the first head-to-head study of fair classification algorithms to compare attribute-reliant, noise-tolerant, and attribute-blind algorithms along the dual axes of predictivity and fairness. We evaluated these algorithms via case studies on four real-world datasets and synthetic perturbations. Our study reveals that attribute-blind and noise-tolerant fair classifiers can potentially achieve a similar level of performance to attribute-reliant algorithms, even when protected attributes are noisy. However, implementing them in practice requires careful nuance. Our study provides insights into the practical implications of using fair classification algorithms in scenarios where protected attributes are noisy or partially available. Comment: Accepted at the 6th AAAI/ACM Conference on Artificial Intelligence, Ethics and Society (AIES) 202

    FaiREE: Fair Classification with Finite-Sample and Distribution-Free Guarantee

    Algorithmic fairness plays an increasingly critical role in machine learning research. Several group fairness notions and algorithms have been proposed. However, the fairness guarantee of existing fair classification methods mainly depends on specific data distributional assumptions, often requiring large sample sizes, and fairness could be violated when there is a modest number of samples, which is often the case in practice. In this paper, we propose FaiREE, a fair classification algorithm that can satisfy group fairness constraints with finite-sample and distribution-free theoretical guarantees. FaiREE can be adapted to satisfy various group fairness notions (e.g., Equality of Opportunity, Equalized Odds, Demographic Parity, etc.) and achieve the optimal accuracy. These theoretical guarantees are further supported by experiments on both synthetic and real data. FaiREE is shown to have favorable performance over state-of-the-art algorithms. Comment: 45 pages, 9 figure

    Towards Better Fairness-Utility Trade-off: A Comprehensive Measurement-Based Reinforcement Learning Framework

    Machine learning is widely used to make decisions with societal impact, such as bank loan approval, criminal sentencing, and resume filtering. How to ensure its fairness while maintaining utility is a challenging but crucial issue. Fairness is a complex and context-dependent concept with over 70 different measurement metrics. Since existing regulations are often vague about which metric to use and different organizations may prefer different fairness metrics, it is important to have means of improving fairness comprehensively. Existing mitigation techniques often target one specific fairness metric and have limitations in improving multiple notions of fairness simultaneously. In this work, we propose CFU (Comprehensive Fairness-Utility), a reinforcement learning-based framework, to efficiently improve the fairness-utility trade-off in machine learning classifiers. A comprehensive measurement that can simultaneously consider multiple fairness notions as well as utility is established, and new metrics are proposed based on an in-depth analysis of the relationship between different fairness metrics. The reward function of CFU is constructed with the comprehensive measurement and the new metrics. We conduct extensive experiments to evaluate CFU on 6 tasks, 3 machine learning models, and 15 fairness-utility measurements. The results demonstrate that CFU can improve the classifier on multiple fairness metrics without sacrificing its utility. It outperforms all state-of-the-art techniques and achieves a 37.5% improvement on average.

    Investigating Trade-offs For Fair Machine Learning Systems

    Fairness in software systems aims to provide algorithms that operate in a nondiscriminatory manner with respect to protected attributes such as gender, race, or age. Ensuring fairness is a crucial non-functional property of data-driven Machine Learning systems. Several approaches (i.e., bias mitigation methods) have been proposed in the literature to reduce the bias of Machine Learning systems. However, this often comes hand in hand with performance deterioration. Therefore, this thesis addresses trade-offs that practitioners face when debiasing Machine Learning systems. First, we perform a literature review to investigate the current state of the art for debiasing Machine Learning systems. This includes an overview of existing debiasing techniques and how they are evaluated (e.g., how bias is measured). As a second contribution, we propose a benchmarking approach that allows for an evaluation and comparison of bias mitigation methods and their trade-offs (i.e., how much performance is sacrificed for improving fairness). Afterwards, we propose a debiasing method ourselves, which modifies already trained Machine Learning models with the goal of improving both their fairness and accuracy. Moreover, this thesis addresses the challenge of how to deal with fairness with regard to age. This question is answered with an empirical evaluation on real-world datasets.

    Robust Machine Learning by Transforming and Augmenting Imperfect Training Data

    Machine Learning (ML) is an expressive framework for turning data into computer programs. Across many problem domains -- both in industry and policy settings -- the types of computer programs needed for accurate prediction or optimal control are difficult to write by hand. On the other hand, collecting instances of desired system behavior may be relatively more feasible. This makes ML broadly appealing, but also induces data sensitivities that often manifest as unexpected failure modes during deployment. In this sense, the training data available tend to be imperfect for the task at hand. This thesis explores several data sensitivities of modern machine learning and how to address them. We begin by discussing how to prevent ML from codifying prior human discrimination measured in the training data, where we take a fair representation learning approach. We then discuss the problem of learning from data containing spurious features, which provide predictive fidelity during training but are unreliable upon deployment. Here we observe that insofar as standard training methods tend to learn such features, this propensity can be leveraged to search for partitions of training data that expose this inconsistency, ultimately promoting learning algorithms invariant to spurious features. Finally, we turn our attention to reinforcement learning from data with insufficient coverage over all possible states and actions. To address the coverage issue, we discuss how causal priors can be used to model the single-step dynamics of the setting where data are collected. This enables a new type of data augmentation where observed trajectories are stitched together to produce new but plausible counterfactual trajectories. Comment: A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy, Department of Computer Science, University of Toront
