
    On the Generation of Realistic and Robust Counterfactual Explanations for Algorithmic Recourse

    The recent widespread deployment of machine learning algorithms presents many new challenges. Machine learning algorithms are usually opaque and can be particularly difficult to interpret. When humans are involved, algorithmic and automated decisions can negatively impact people’s lives. Therefore, end users would like to be insured against potential harm. One popular way to achieve this is to provide end users access to algorithmic recourse, which gives end users negatively affected by algorithmic decisions the opportunity to reverse unfavorable decisions, e.g., from a loan denial to a loan acceptance. In this thesis, we design recourse algorithms to meet various end user needs. First, we propose methods for the generation of realistic recourses. We use generative models to suggest recourses likely to occur under the data distribution. To this end, we shift the recourse action from the input space to the generative model’s latent space, allowing us to generate counterfactuals that lie in regions with data support. Second, we observe that small changes applied to the recourses prescribed to end users are likely to invalidate the suggested recourse once it is noisily implemented in practice. Motivated by this observation, we design methods for the generation of robust recourses and for assessing the robustness of recourse algorithms to data deletion requests. Third, the lack of a commonly used code base for counterfactual explanation and algorithmic recourse algorithms and the vast array of evaluation measures in the literature make it difficult to compare the performance of different algorithms. To solve this problem, we provide an open-source benchmarking library that streamlines the evaluation process and can be used for benchmarking, rapidly developing new methods, and setting up new experiments. In summary, our work contributes to a more reliable interaction between end users and machine-learned models by covering fundamental aspects of the recourse process, and suggests new solutions towards generating realistic and robust counterfactual explanations for algorithmic recourse.
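
    To make the latent-space step concrete, below is a minimal, hypothetical sketch of gradient-based counterfactual search in the latent space of a pretrained autoencoder. The `encoder`, `decoder`, and `clf` handles, the loss form, and all hyperparameters are illustrative assumptions, not the thesis implementation.

```python
# Hypothetical sketch: search for a counterfactual in the latent space of a
# pretrained autoencoder so that the decoded recourse stays in regions with
# data support. All model handles and hyperparameters are assumptions.
import torch

def latent_recourse(x, encoder, decoder, clf, target=1.0,
                    steps=200, lr=0.05, dist_weight=0.1):
    # Start the search from the latent code of the factual input.
    z = encoder(x).detach().clone().requires_grad_(True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        x_cf = decoder(z)   # counterfactual lies on the learned data manifold
        pred = clf(x_cf)
        # Push the prediction towards the favorable outcome while keeping
        # the counterfactual close to the original input (L1 proximity).
        loss = ((pred - target) ** 2).sum() + dist_weight * torch.norm(x_cf - x, p=1)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return decoder(z).detach()
```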

    A comparative study of evolutionary approaches to the bi-objective dynamic Travelling Thief Problem

    Dynamic evolutionary multi-objective optimization is a thriving research area. Recent contributions span the development of specialized algorithms and the construction of challenging benchmark problems. Here, we continue these research directions through the development and analysis of a new bi-objective problem, the dynamic Travelling Thief Problem (TTP), including three modes of dynamic change: city locations, item profit values, and item availability. The interconnected problem components embedded in the dynamic problem dictate that effectively tracking good trade-off solutions that satisfy both objectives throughout dynamic events is non-trivial. Consequently, we examine the relative contribution to the non-dominated set from a variety of population seeding strategies, including exact solvers and greedy algorithms for the knapsack and tour components, and random techniques. We introduce this responsive seeding extension within an evolutionary algorithm framework. The efficacy of alternative seeding mechanisms is evaluated across a range of exemplary problem instances using ranking-based and quantitative statistical comparisons, which combine performance measurements taken throughout the optimization. Our detailed experiments show that the different dynamic TTP instances present varying difficulty to the seeding methods tested. We posit the dynamic TTP as a suitable benchmark capable of generating problem instances with different controllable characteristics that align with many real-world problems.
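
    As a rough, hypothetical illustration of such a responsive seeding extension (reduced to a single objective for brevity; `evaluate`, `vary`, and `seed_solution` are placeholder callables, not the paper's implementation):

```python
# Hypothetical sketch: on every detected dynamic change, replace the worst
# individuals with solutions built by a seeding strategy (e.g., a greedy
# tour/packing heuristic), keeping the rest of the population for diversity.
import random

def evolve(population, evaluate, vary, seed_solution, generations=1000,
           change_detected=lambda g: g % 100 == 0, seed_fraction=0.25):
    for gen in range(generations):
        if change_detected(gen):
            # Worst individuals first (here: minimisation of `evaluate`).
            population.sort(key=evaluate, reverse=True)
            n_seed = int(seed_fraction * len(population))
            population[:n_seed] = [seed_solution() for _ in range(n_seed)]
        offspring = [vary(random.choice(population)) for _ in population]
        # Keep the best individuals among parents and offspring.
        population = sorted(population + offspring, key=evaluate)[:len(population)]
    return population
```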

    Fairness-aware Machine Learning in Educational Data Mining

    Fairness is an essential requirement of every educational system, which is reflected in a variety of educational activities. With the extensive use of Artificial Intelligence (AI) and Machine Learning (ML) techniques in education, researchers and educators can analyze educational (big) data and propose new (technical) methods to support teachers, students, or administrators of (online) learning systems in the organization of teaching and learning. Educational data mining (EDM) is the result of the application and development of data mining (DM) and ML techniques to deal with educational problems, such as student performance prediction and student grouping. However, ML-based decisions in education can be based on protected attributes, such as race or gender, leading to discrimination against individual students or subgroups of students. Therefore, ensuring fairness in ML models also contributes to equity in educational systems. On the other hand, bias can also appear in the data obtained from learning environments. Hence, bias-aware exploratory educational data analysis is important to support unbiased decision-making in EDM. In this thesis, we address the aforementioned issues and propose methods that mitigate discriminatory outcomes of ML algorithms in EDM tasks. Specifically, we make the following contributions: We perform bias-aware exploratory analysis of educational datasets using Bayesian networks to identify the relationships among attributes in order to understand bias in the datasets. We focus the exploratory data analysis on features having a direct or indirect relationship with the protected attributes w.r.t. prediction outcomes. We perform a comprehensive evaluation of the sufficiency of various group fairness measures in predictive models for student performance prediction problems. A variety of experiments on various educational datasets with different fairness measures are performed to provide users with a broad view of unfairness from diverse aspects. We deal with the student grouping problem in collaborative learning. We introduce the fair-capacitated clustering problem that takes into account cluster fairness and cluster cardinalities, and propose two approaches, namely hierarchical clustering and partitioning-based clustering, to obtain fair-capacitated clustering. We introduce the multi-fair capacitated (MFC) students-topics grouping problem that satisfies students' preferences while ensuring balanced group cardinalities and maximizing the diversity of members regarding the protected attribute, and propose three approaches: a greedy heuristic approach, a knapsack-based approach using a vanilla maximal 0-1 knapsack formulation, and an MFC knapsack approach based on a group fairness knapsack formulation. In short, the findings described in this thesis demonstrate the importance of fairness-aware ML in educational settings. We show that bias-aware data analysis, fairness measures, and fairness-aware ML models are essential aspects of ensuring fairness in EDM and the educational environment.
    Funding: Ministry of Science and Culture of Lower Saxony/LernMINT/51410078/E
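
    For concreteness, here is a minimal sketch of two widely used group fairness measures of the kind evaluated in such studies, assuming binary predictions and a binary protected attribute; the function names and encoding are illustrative assumptions.

```python
# Hypothetical sketch: two common group fairness measures for a binary
# student-performance classifier; y_pred/y_true are 0-1 arrays and `group`
# encodes the protected attribute (e.g., 0/1 for two gender groups).
import numpy as np

def statistical_parity_difference(y_pred, group):
    # P(predicted pass | group 1) - P(predicted pass | group 0)
    return y_pred[group == 1].mean() - y_pred[group == 0].mean()

def equal_opportunity_difference(y_pred, y_true, group):
    # Gap in true positive rates between the two groups.
    tpr1 = y_pred[(group == 1) & (y_true == 1)].mean()
    tpr0 = y_pred[(group == 0) & (y_true == 1)].mean()
    return tpr1 - tpr0
```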

    LIPIcs, Volume 251, ITCS 2023, Complete Volume

    LIPIcs, Volume 251, ITCS 2023, Complete Volume

    Provably Efficient Generalized Lagrangian Policy Optimization for Safe Multi-Agent Reinforcement Learning

    We examine online safe multi-agent reinforcement learning using constrained Markov games, in which agents compete by maximizing their expected total rewards under a constraint on expected total utilities. Our focus is confined to an episodic two-player zero-sum constrained Markov game with independent transition functions that are unknown to agents, adversarial reward functions, and stochastic utility functions. For such a Markov game, we employ an approach based on the occupancy measure to formulate it as an online constrained saddle-point problem with an explicit constraint. We extend the Lagrange multiplier method in constrained optimization to handle the constraint by creating a generalized Lagrangian with minimax decision primal variables and a dual variable. Next, we develop an upper confidence reinforcement learning algorithm to solve this Lagrangian problem while balancing exploration and exploitation. Our algorithm updates the minimax decision primal variables via online mirror descent and the dual variable via a projected gradient step, and we prove that it enjoys a sublinear rate $O((|X|+|Y|) L \sqrt{T(|A|+|B|)})$ for both regret and constraint violation after playing $T$ episodes of the game. Here, $L$ is the horizon of each episode, and $(|X|,|A|)$ and $(|Y|,|B|)$ are the state/action space sizes of the min-player and the max-player, respectively. To the best of our knowledge, we provide the first provably efficient online safe reinforcement learning algorithm in constrained Markov games. Comment: 59 pages; a full version of the main paper in the 5th Annual Conference on Learning for Dynamics and Control.
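
    In notation matching the abstract, the generalized Lagrangian approach can be paraphrased as below; this is a hedged reconstruction from the abstract's description, not the paper's exact formulation.

```latex
% Hedged reconstruction: occupancy measures x (min-player) and y (max-player),
% reward r, utility u with threshold c, and dual variable \lambda \ge 0.
\[
  \mathcal{L}(x, y; \lambda)
    \;=\; \mathbb{E}_{x,y}\!\left[\text{reward}\right]
    \;+\; \lambda \left( \mathbb{E}_{x,y}\!\left[\text{utility}\right] - c \right)
\]
% Primal variables are updated by online mirror descent/ascent on
% \mathcal{L}; the dual variable takes a projected gradient step:
\[
  \lambda_{t+1} \;=\; \Pi_{[0,\Lambda]}\!\left( \lambda_t
      - \eta \left( \mathbb{E}_{x_t, y_t}\!\left[\text{utility}\right] - c \right) \right)
\]
```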

    Data- and expert-driven variable selection for predictive models in healthcare: towards increased interpretability in underdetermined machine learning problems

    Modern data acquisition techniques in healthcare generate large collections of data from multiple sources, such as novel diagnosis and treatment methodologies. Some concrete examples are electronic healthcare record systems, genomics, and medical images. This leads to situations with often unstructured, high-dimensional heterogeneous patient cohort data where classical statistical methods may not be sufficient for optimal utilization of the data and informed decision-making. Instead, investigating such data structures with modern machine learning techniques promises to improve the understanding of patient health issues and may provide a better platform for informed decision-making by clinicians. Key requirements for this purpose include (a) sufficiently accurate predictions and (b) model interpretability. Achieving both aspects in parallel is difficult, particularly for datasets with few patients, which are common in the healthcare domain. In such cases, machine learning models encounter mathematically underdetermined systems and may overfit easily on the training data. An important approach to overcome this issue is feature selection, i.e., determining a subset of informative features from the original set of features with respect to the target variable. While potentially raising the predictive performance, feature selection fosters model interpretability by identifying a low number of relevant model parameters to better understand the underlying biological processes that lead to health issues. Interpretability requires that feature selection is stable, i.e., small changes in the dataset do not lead to changes in the selected feature set. A concept to address instability is ensemble feature selection, i.e., the process of repeating the feature selection multiple times on subsets of samples of the original dataset and aggregating the results in a meta-model. This thesis presents two approaches for ensemble feature selection, which are tailored towards high-dimensional data in healthcare: the Repeated Elastic Net Technique for feature selection (RENT) and the User-Guided Bayesian Framework for feature selection (UBayFS). While RENT is purely data-driven and builds upon elastic net regularized models, UBayFS is a general framework for ensembles with the capability to include expert knowledge in the feature selection process via prior weights and side constraints. A case study modeling the overall survival of cancer patients compares these novel feature selectors and demonstrates their potential in clinical practice. Beyond the selection of single features, UBayFS also allows for selecting whole feature groups (feature blocks) that were acquired from multiple data sources, such as those mentioned above. Importance quantification of such feature blocks plays a key role in tracing information about the target variable back to the acquisition modalities. Such information on feature block importance may lead to positive effects on the use of human, technical, and financial resources if systematically integrated into the planning of patient treatment by excluding the acquisition of non-informative features. Since a generalization of feature importance measures to block importance is not trivial, this thesis also investigates and compares approaches for feature block importance rankings. This thesis demonstrates that high-dimensional datasets from multiple data sources in the medical domain can be successfully tackled by the presented approaches for feature selection.
    Experimental evaluations demonstrate favorable properties in terms of predictive performance, stability, and interpretability of the results, which carries high potential for better data-driven decision support in clinical practice.
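
    As an illustration of the ensemble idea behind RENT-style selection, here is a minimal sketch using scikit-learn; the regularisation parameters, subsample size, and frequency cutoff are illustrative assumptions rather than the thesis defaults.

```python
# Hypothetical sketch of ensemble feature selection: fit an elastic net on
# repeated subsamples and keep features whose coefficients are non-zero in
# a sufficient fraction of the models (a stability criterion).
import numpy as np
from sklearn.linear_model import ElasticNet
from sklearn.utils import resample

def ensemble_select(X, y, n_models=100, subsample=0.8, freq_cutoff=0.9):
    counts = np.zeros(X.shape[1])
    for seed in range(n_models):
        Xs, ys = resample(X, y, n_samples=int(subsample * len(X)),
                          random_state=seed)
        model = ElasticNet(alpha=0.1, l1_ratio=0.5).fit(Xs, ys)
        counts += (model.coef_ != 0)        # per-feature selection frequency
    return np.where(counts / n_models >= freq_cutoff)[0]
```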

    Adjustable robust optimization with nonlinear recourses

    Over the last century, mathematical optimization has become a prominent tool for decision making. Its systematic application in practical fields such as economics, logistics, or defense led to the development of algorithmic methods with ever increasing efficiency. Indeed, for a variety of real-world problems, finding an optimal decision among a set of (implicitly or explicitly) predefined alternatives has become conceivable in reasonable time. In recent decades, however, the research community has paid more and more attention to the role of uncertainty in the optimization process. In particular, one may question the notion of optimality, and even feasibility, when studying decision problems with unknown or imprecise input parameters. This concern is even more critical in a world becoming more and more complex, by which we mean interconnected, where each individual variation inside a system inevitably causes other variations in the system itself. In this dissertation, we study a class of optimization problems that suffer from imprecise input data and feature a two-stage decision process, i.e., where decisions are made in a sequential order, called stages, and where unknown parameters are revealed throughout the stages. Applications of such problems are plentiful in practical fields, e.g., facility location problems with uncertain demands, transportation problems with uncertain costs, or scheduling under uncertain processing times. The uncertainty is dealt with from a robust optimization (RO) viewpoint (also known as the "worst-case perspective"), and we present original contributions to the RO literature on both the theoretical and practical sides.
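
    The two-stage structure can be summarized by the generic adjustable robust counterpart below, where the here-and-now decision x is fixed before the uncertain parameter u is revealed and the wait-and-see recourse y reacts to it; this is a textbook-style formulation, not quoted from the thesis.

```latex
% Generic two-stage adjustable robust optimization problem: x is the
% first-stage decision, u the uncertainty drawn from the set U, and
% y the second-stage (recourse) decision chosen after u is revealed.
\[
  \min_{x \in X} \;\max_{u \in U} \;\min_{y \in Y(x,u)} \; f(x) + g(x, y, u)
\]
```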

    Matheuristics: survey and synthesis

    In integer programming and combinatorial optimisation, people use the term matheuristics to refer to methods that are heuristic in nature but draw on concepts from the literature on exact methods. We survey the literature on this topic, with a particular emphasis on matheuristics that yield both primal and dual bounds (i.e., upper and lower bounds in the case of a minimisation problem). We also make some comments about possible future developments.
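
    As one minimal example of a matheuristic that yields both bounds, consider LP-relaxation rounding for a 0-1 covering-type minimisation problem (min c'x subject to Ax >= b, x binary): the relaxation value is a dual (lower) bound and the repaired rounding gives a primal (upper) bound. This is a generic sketch assuming a feasible covering instance with A >= 0, not a method taken from the survey.

```python
# Hypothetical sketch: the LP relaxation gives a dual (lower) bound;
# rounding plus greedy repair gives a primal (upper) bound. Assumes A >= 0
# and a feasible covering instance.
import numpy as np
from scipy.optimize import linprog

def lp_round_matheuristic(c, A, b):
    # Dual bound: relax x in {0,1} to 0 <= x <= 1 (Ax >= b as -Ax <= -b).
    relax = linprog(c, A_ub=-A, b_ub=-b, bounds=[(0, 1)] * len(c))
    dual_bound = relax.fun
    # Primal bound: round, then repair violated rows with cheap variables.
    x = (relax.x >= 0.5).astype(float)
    while (A @ x < b).any():
        row = np.argmax(b - A @ x)          # most violated constraint
        j = np.argmin(np.where((x == 0) & (A[row] > 0), c, np.inf))
        x[j] = 1.0                          # cheapest variable that helps
    return dual_bound, float(c @ x), x
```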

    Towards the reduction of greenhouse gas emissions : models and algorithms for ridesharing and carbon capture and storage

    With the ratification of the Paris Agreement, countries committed to limiting global warming to well below 2, preferably to 1.5 degrees Celsius, compared to pre-industrial levels. To this end, anthropogenic greenhouse gas (GHG) emissions (such as CO2) must be reduced to reach net-zero carbon emissions by 2050. This ambitious target may be met by means of different GHG mitigation strategies, such as electrification, changes in consumer behavior, improving the energy efficiency of processes, using substitutes for fossil fuels (such as bioenergy or hydrogen), and carbon capture and storage (CCS). This thesis aims at contributing to two of these strategies: ridesharing (which belongs to the category of changes in consumer behavior) and carbon capture and storage. It provides mathematical optimization models and algorithms for the operational and tactical planning of ridesharing systems, and heuristics for the strategic planning of a carbon capture and storage network. In ridesharing, emissions are reduced when individuals travel together instead of driving alone. In this context, this thesis provides novel mathematical models to represent ridesharing systems, ranging from two-stage stochastic assignment problems to two-stage stochastic set packing problems that can represent a wide variety of ridesharing systems. These models aid decision makers in their operational planning of rideshares, where drivers and riders have to be matched for ridesharing in the short term. Additionally, this thesis explores the tactical planning of ridesharing systems by comparing different modes of ridesharing operation and platform parameters (e.g., revenue share and penalties). Novel problem characteristics are studied, such as driver and rider uncertainty, rematching flexibility, and reservation of driver supply through booking fees and penalties. In particular, rematching flexibility may increase the efficiency of a ridesharing platform, and the reservation of driver supply through booking fees and penalties may increase user satisfaction through guaranteed compensation if a rideshare is not provided. Extensive computational experiments are conducted and managerial insights are given. Despite the opportunity to reduce emissions through ridesharing and other mitigation strategies, global macroeconomic studies show that even if several GHG mitigation strategies are used simultaneously, achieving net-zero emissions by 2050 will likely not be possible without CCS. Here, CO2 is captured from emitter sites and transported to geological reservoirs, where it is injected for long-term storage. This thesis considers a multiperiod strategic planning problem for the optimization of a CCS value chain.
    This problem is a combined facility location and network design problem where a CCS infrastructure is planned for the coming decades. Due to the computational challenges associated with this problem, a slope scaling heuristic is introduced that is capable of finding better solutions than a state-of-the-art general-purpose mathematical programming solver, at a fraction of the computational time. This heuristic features intensification and diversification phases, improved generation of feasible solutions through dynamic programming, and a final refining step based on a restricted model. Overall, the contributions of this thesis on ridesharing and CCS provide mathematical programming models, algorithms, and managerial insights that may help practitioners and stakeholders plan for net-zero emissions.
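
    A minimal sketch of the slope scaling idea at the core of such a heuristic for fixed-charge network design is given below; the thesis method adds intensification/diversification phases, dynamic programming, and a refinement step on top. The `solve_linear` oracle, cost vectors, and update rule shown are simplifying assumptions.

```python
# Hypothetical sketch of slope scaling: fold the fixed cost f of each arc
# into a linearised unit cost, solve the resulting linear problem, and
# re-estimate the slopes from the flows of the previous iteration.
import numpy as np

def slope_scaling(solve_linear, c, f, iters=50):
    # solve_linear(rho) -> flow vector minimising the linear cost rho'x
    # subject to the (fixed) flow constraints; c, f are unit/fixed costs.
    rho = c + f                              # initial linearised slopes
    best_cost, best_flow = np.inf, None
    for _ in range(iters):
        x = solve_linear(rho)                # linear approximation step
        used = x > 1e-9
        true_cost = c @ x + f @ used         # evaluate with real fixed charges
        if true_cost < best_cost:
            best_cost, best_flow = true_cost, x
        # Make the linear cost exact on the arcs used by the current flow.
        rho[used] = c[used] + f[used] / x[used]
    return best_cost, best_flow
```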

    Toward Efficient and Robust Computer Vision for Large-Scale Edge Applications

    The past decade has witnessed remarkable advancements in computer vision and deep learning algorithms, ushering in a transformative wave of large-scale edge applications across various industries. These image processing methods, however, still encounter numerous challenges when it comes to meeting real-world demands, especially in terms of accuracy and latency at scale. Indeed, striking a balance among efficiency, robustness, and scalability remains a common obstacle. This dissertation investigates these issues in the context of different computer vision tasks, including image classification, semantic segmentation, depth estimation, and object detection. We introduce novel solutions, focusing on utilizing adjustable neural networks, joint multi-task architecture search, and generalized supervision interpolation. The first obstacle revolves around the ability to trade off between speed and accuracy in convolutional neural networks (CNNs) during inference on resource-constrained platforms. Despite their progress, CNNs are typically monolithic at runtime, which can present practical difficulties since computational budgets may vary over time. To address this, we introduce the Any-Width Network, an adjustable-width CNN architecture that utilizes a novel Triangular Convolution module to enable fine-grained control over speed and accuracy during inference. The second challenge concerns the computationally demanding nature of dense prediction tasks such as semantic segmentation and depth estimation. This issue becomes especially problematic for edge platforms with limited resources. To tackle this, we propose a novel and scalable framework named EDNAS. EDNAS leverages the synergistic relationship between Multi-Task Learning and hardware-aware Neural Architecture Search to significantly enhance the on-device speed and accuracy of dense predictions. Finally, to improve the robustness of object detection, we introduce a novel data mixing augmentation. While mixing techniques such as Mixup have proven successful in image classification, their application to object detection is non-trivial due to spatial misalignment, foreground/background distinction, and instance multiplicity. To address these issues, we propose a generalized data mixing principle, Supervision Interpolation, and its simple yet effective implementation, LossMix. By addressing these challenges, this dissertation aims to facilitate better efficiency, accuracy, and scalability of computer vision and deep learning algorithms and to contribute to the advancement of large-scale edge applications across different domains.
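
    As a hedged sketch of the supervision interpolation principle (simplified to a generic criterion; the actual LossMix formulation for object detection may differ):

```python
# Hypothetical sketch: interpolate the inputs as in Mixup, but interpolate
# the supervision at the loss level, which remains well defined even when
# the targets themselves (e.g., detection boxes) cannot be averaged.
import torch

def lossmix_step(model, criterion, x1, y1, x2, y2, alpha=0.2):
    lam = float(torch.distributions.Beta(alpha, alpha).sample())
    x_mix = lam * x1 + (1.0 - lam) * x2      # pixel-space interpolation
    out = model(x_mix)
    # Weighted sum of the losses against both original targets.
    return lam * criterion(out, y1) + (1.0 - lam) * criterion(out, y2)
```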