
    On the Generation of Realistic and Robust Counterfactual Explanations for Algorithmic Recourse

    The recent widespread deployment of machine learning algorithms presents many new challenges. Machine learning algorithms are usually opaque and can be particularly difficult to interpret. When humans are involved, algorithmic and automated decisions can negatively impact people’s lives. Therefore, end users would like to be insured against potential harm. One popular way to achieve this is to provide end users access to algorithmic recourse, which gives end users negatively affected by algorithmic decisions the opportunity to reverse unfavorable decisions, e.g., from a loan denial to a loan acceptance. In this thesis, we design recourse algorithms to meet various end user needs. First, we propose methods for the generation of realistic recourses. We use generative models to suggest recourses likely to occur under the data distribution. To this end, we shift the recourse action from the input space to the generative model’s latent space, allowing us to generate counterfactuals that lie in regions with data support. Second, we observe that small changes applied to the recourses prescribed to end users are likely to invalidate the suggested recourse after it is noisily implemented in practice. Motivated by this observation, we design methods for the generation of robust recourses and for assessing the robustness of recourse algorithms to data deletion requests. Third, the lack of a commonly used code base for counterfactual explanation and algorithmic recourse algorithms, together with the vast array of evaluation measures in the literature, makes it difficult to compare the performance of different algorithms. To solve this problem, we provide an open-source benchmarking library that streamlines the evaluation process and can be used for benchmarking, rapidly developing new methods, and setting up new experiments. In summary, our work contributes to a more reliable interaction between end users and machine-learned models by covering fundamental aspects of the recourse process, and it suggests new solutions towards generating realistic and robust counterfactual explanations for algorithmic recourse.
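    As a rough illustration of the latent-space idea described above, the hedged sketch below searches a generative model's latent space so that the decoded counterfactual flips a classifier's decision while staying close to the original input. The decoder, classifier, loss weights, and dimensions are illustrative assumptions, not the thesis's actual models.

```python
# Hedged sketch of latent-space recourse: instead of perturbing the input x
# directly, we search the latent space of a (pretrained) generative model so
# that the decoded counterfactual stays on the data manifold. The decoder,
# classifier, and loss weights below are illustrative stand-ins.
import torch
import torch.nn as nn

torch.manual_seed(0)

latent_dim, input_dim = 4, 10
decoder = nn.Sequential(nn.Linear(latent_dim, 32), nn.ReLU(), nn.Linear(32, input_dim))
classifier = nn.Sequential(nn.Linear(input_dim, 16), nn.ReLU(), nn.Linear(16, 1))

x_factual = torch.randn(1, input_dim)                # instance that received the unfavorable decision
z = torch.zeros(1, latent_dim, requires_grad=True)   # latent code to optimise
target = torch.ones(1, 1)                            # desired (favorable) outcome
opt = torch.optim.Adam([z], lr=0.05)

for step in range(200):
    opt.zero_grad()
    x_cf = decoder(z)                                # counterfactual lies in the decoder's range, i.e. has data support
    pred = torch.sigmoid(classifier(x_cf))
    loss = nn.functional.binary_cross_entropy(pred, target) \
           + 0.1 * torch.norm(x_cf - x_factual, p=1)  # stay close to the original instance
    loss.backward()
    opt.step()

print("counterfactual prediction:", torch.sigmoid(classifier(decoder(z))).item())
```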

    LIPIcs, Volume 251, ITCS 2023, Complete Volume

    LIPIcs, Volume 251, ITCS 2023, Complete Volume

    Pure Message Passing Can Estimate Common Neighbor for Link Prediction

    Message Passing Neural Networks (MPNNs) have emerged as the de facto standard in graph representation learning. However, when it comes to link prediction, they often struggle, surpassed by simple heuristics such as Common Neighbor (CN). This discrepancy stems from a fundamental limitation: while MPNNs excel in node-level representation, they stumble with encoding the joint structural features essential to link prediction, like CN. To bridge this gap, we posit that, by harnessing the orthogonality of input vectors, pure message passing can indeed capture joint structural features. Specifically, we study the proficiency of MPNNs in approximating CN heuristics. Based on our findings, we introduce the Message Passing Link Predictor (MPLP), a novel link prediction model. MPLP taps into quasi-orthogonal vectors to estimate link-level structural features, all while preserving the node-level complexities. Moreover, our approach demonstrates that leveraging message passing to capture structural features could offset MPNNs' expressiveness limitations at the expense of estimation variance. We conduct experiments on benchmark datasets from various domains, where our method consistently outperforms the baseline methods.
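    The hedged sketch below illustrates the core intuition: if nodes carry random, quasi-orthogonal vectors, one round of message passing followed by an inner product approximates the Common Neighbor count. The graph, dimensions, and normalisation are illustrative assumptions, not the MPLP implementation.

```python
# Hedged sketch: one message-passing step over random, quasi-orthogonal node
# vectors yields an unbiased-style estimate of the Common Neighbor (CN) count.
import numpy as np

rng = np.random.default_rng(0)
n_nodes, dim = 100, 2048

# Random vectors in high dimension are nearly orthogonal and roughly unit-norm.
X = rng.standard_normal((n_nodes, dim)) / np.sqrt(dim)

# Random sparse undirected graph (illustrative).
A = (rng.random((n_nodes, n_nodes)) < 0.05).astype(float)
A = np.triu(A, 1); A = A + A.T

H = A @ X                      # one message-passing step: sum the neighbours' vectors

u, v = 3, 7
cn_estimate = H[u] @ H[v]      # cross terms are ~0, shared-neighbour terms ~1 each
cn_exact = int(A[u] @ A[v])    # true number of common neighbours

print(f"exact CN = {cn_exact}, message-passing estimate = {cn_estimate:.2f}")
```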

    A reduced order modeling methodology for the parametric estimation and optimization of aviation noise

    The successful mitigation of aviation noise is one of the key enablers of sustainable aviation growth. Technological improvements for noise reduction at the source have been countered by an increasing number of operations at most airports. There are several consequences of aviation noise, including direct health effects, effects on human and non-human environments, and economic costs. Several mitigation strategies exist, including reduction of noise at the source, land-use planning and management, noise abatement operational procedures, and operating restrictions. Most noise management programs at airports use a combination of such mitigation measures. To assess the efficacy of noise mitigation measures, a robust modeling and simulation capability is required. Due to the large number of factors which can influence aviation noise metrics, current state-of-the-art tools rely on physics-based and semi-empirical models. These models help in accurately predicting noise metrics in a wide range of scenarios; however, they are computationally expensive to evaluate. Therefore, current noise mitigation studies are limited to singular applications such as annual average day noise quantification. Many-query applications such as parametric trade-off analyses and optimization remain elusive with the current generation of tools and methods. Several efforts documented in the literature attempt to speed up the process using surrogate models. Techniques include the use of pre-computed noise grids with calibration models for non-standard conditions. These techniques are typically predicated on simplifying assumptions which greatly limit the applicability of such models. Simplifying assumptions are needed to downsize the number of influencing factors to be modeled and make the problem tractable. Existing efforts also suffer from the inclusion of categorical variables for operational profiles, which are not conducive to surrogate modeling. In this research, a methodology is developed to address the inherent complexities of the noise quantification process, and thus enable rapid noise modeling capabilities which can facilitate parametric trade-off analysis and optimization efforts. To achieve this objective, a research plan is developed and executed to address two major gaps in the literature. First, a parametric representation of operational profiles is proposed to replace existing categorical descriptions. A technique is developed to allow real-world flight data to be efficiently mapped onto this parametric definition. A trajectory clustering method is used to group similar flights, and representative flights are parametrized using an inverse map of an aircraft performance model. Next, a field surrogate modeling method is developed based on Model Order Reduction techniques to reduce the high dimensionality of computed noise metric results. This greatly reduces the complexity of the data to be modeled, and thus enables rapid noise quantification. With these two gaps addressed, the overall methodology is developed for rapid noise quantification and optimization. The methodology is demonstrated on a case study in which a large number of real-world flight trajectories are efficiently modeled for their noise results. As each such flight trajectory has a unique representation, and typically lacks thrust information, such noise modeling is not computationally feasible with existing methods and tools. The developed parametric representations and field surrogate modeling capabilities enable such an application.
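    The hedged sketch below illustrates one way such a field surrogate could look: snapshot noise fields are compressed with a truncated SVD (a standard Model Order Reduction step) and a simple regressor maps operational parameters to the retained modal coefficients. The synthetic data, rank, and regressor are illustrative assumptions, not the methodology developed in the thesis.

```python
# Hedged sketch of a field surrogate via Model Order Reduction: compress
# snapshot noise fields with a truncated SVD (POD) and regress the retained
# modal coefficients on operational parameters. All data here are synthetic.
import numpy as np

rng = np.random.default_rng(0)
n_snapshots, n_grid_points, n_params, rank = 200, 5000, 3, 10

params = rng.uniform(0.0, 1.0, (n_snapshots, n_params))      # e.g. weight, speed, climb angle (assumed)
snapshots = params @ rng.standard_normal((n_params, n_grid_points)) \
            + 0.01 * rng.standard_normal((n_snapshots, n_grid_points))  # stand-in noise-metric fields

# POD basis from a truncated SVD of the snapshot matrix.
U, s, Vt = np.linalg.svd(snapshots, full_matrices=False)
basis = Vt[:rank]                                             # (rank, n_grid_points)
coeffs = snapshots @ basis.T                                  # modal coefficients per snapshot

# Linear least-squares map from parameters to modal coefficients.
W, *_ = np.linalg.lstsq(params, coeffs, rcond=None)

# Rapid evaluation for a new operating condition.
new_params = rng.uniform(0.0, 1.0, (1, n_params))
predicted_field = (new_params @ W) @ basis                    # reconstructed full noise field
print(predicted_field.shape)
```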

    Exploring Hyperspectral Imaging and 3D Convolutional Neural Network for Stress Classification in Plants

    Hyperspectral imaging (HSI) has emerged as a transformative technology in imaging, characterized by its ability to capture a wide spectrum of light, including wavelengths beyond the visible range. This approach significantly differs from traditional imaging methods such as RGB imaging, which uses three color channels, and multispectral imaging, which captures several discrete spectral bands. Through this approach, HSI offers detailed spectral signatures for each pixel, facilitating a more nuanced analysis of the imaged subjects. This capability is particularly beneficial in applications like agricultural practices, where it can detect changes in physiological and structural characteristics of crops. Moreover, the ability of HSI to monitor these changes over time is advantageous for observing how subjects respond to different environmental conditions or treatments. However, the high-dimensional nature of hyperspectral data presents challenges in data processing and feature extraction. Traditional machine learning algorithms often struggle to handle such complexity. This is where 3D Convolutional Neural Networks (CNNs) become valuable. Unlike 1D-CNNs, which extract features from spectral dimensions, and 2D-CNNs, which focus on spatial dimensions, 3D-CNNs have the capability to process data across both spectral and spatial dimensions. This makes them adept at extracting complex features from hyperspectral data. In this thesis, we explored the potency of HSI combined with 3D-CNNs in the agriculture domain, where plant health and vitality are paramount. To evaluate this, we subjected lettuce plants to varying stress levels to assess the performance of this method in classifying the stressed lettuce at the early stages of growth into their respective stress-level groups. For this study, we created a dataset comprising 88 hyperspectral image samples of stressed lettuce. Utilizing Bayesian optimization, we developed 350 distinct 3D-CNN models to assess the method. The top-performing model achieved a 75.00% test accuracy. Additionally, we addressed the challenge of generating valid 3D-CNN models in the Keras Tuner library through meticulous hyperparameter configuration. Our investigation also extends to the role of individual channels and channel groups within the color and near-infrared spectrum in predicting results for each stress-level group. We observed that the red and green spectra have a higher influence on the prediction results. Furthermore, we conducted a comprehensive review of 3D-CNN-based classification techniques for diseased and defective crops using non-UAV-based hyperspectral images.
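    As a rough sketch of the kind of model involved, the hedged Keras example below builds a small 3D-CNN that convolves jointly over the spatial and spectral dimensions of a hyperspectral cube. The cube size, layer widths, and the assumption of four stress-level classes are illustrative, not the tuned architecture found with Bayesian optimization in the thesis.

```python
# Hedged sketch of a small 3D-CNN for stress classification on hyperspectral
# cubes. Input shape (64 x 64 pixels x 200 bands) and all layer sizes are
# illustrative assumptions.
import tensorflow as tf
from tensorflow.keras import layers, models

num_classes = 4                                    # assumed number of stress-level groups
model = models.Sequential([
    layers.Input(shape=(64, 64, 200, 1)),          # (height, width, spectral bands, channel)
    layers.Conv3D(8, kernel_size=(3, 3, 7), activation="relu"),   # joint spatial + spectral kernel
    layers.MaxPooling3D(pool_size=(2, 2, 4)),
    layers.Conv3D(16, kernel_size=(3, 3, 5), activation="relu"),
    layers.MaxPooling3D(pool_size=(2, 2, 4)),
    layers.GlobalAveragePooling3D(),
    layers.Dense(32, activation="relu"),
    layers.Dropout(0.3),
    layers.Dense(num_classes, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.summary()
```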

    Optimising multimodal fusion for biometric identification systems

    Biometric systems are automatic means for imitating the human brain’s ability to identify and verify other humans by their behavioural and physiological characteristics. A system which uses more than one biometric modality at the same time is known as a multimodal system. Multimodal biometric systems consolidate the evidence presented by multiple biometric sources and typically provide better recognition performance compared to systems based on a single biometric modality. This thesis addresses some issues related to the implementation of multimodal biometric identity verification systems. The thesis assesses the feasibility of using commercial off-the-shelf products to construct a deployable multimodal biometric system. It also identifies multimodal biometric fusion as a challenging optimisation problem when one considers the presence of several configurations and settings, in particular the verification thresholds adopted by each biometric device and the decision fusion algorithm implemented for a particular configuration. The thesis proposes a novel approach for the optimisation of multimodal biometric systems based on the use of genetic algorithms for solving some of the problems associated with the different settings. The proposed optimisation method also addresses some of the problems associated with score normalisation. In addition, the thesis presents an analysis of the performance of different fusion rules when characterising the system users as sheep, goats, lambs and wolves. The results presented indicate that the proposed optimisation method can be used to solve the problems associated with threshold settings. This clearly demonstrates a valuable potential strategy that can be used to set a priori thresholds of the different biometric devices before using them. The proposed optimisation architecture also addresses the problem of score normalisation, which enables an effective “plug-and-play” approach to system implementation. The results also indicate that the optimisation approach can be used for effectively determining the weight settings, which are used in many applications for varying the relative importance of the different performance parameters.
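    The hedged sketch below illustrates the optimisation idea on a toy problem: a simple genetic algorithm searches per-device verification thresholds that minimise the combined false accept and false reject rates under a fixed fusion rule. The synthetic scores, the AND fusion rule, and the GA settings are illustrative assumptions, not the thesis's optimisation architecture.

```python
# Hedged sketch: optimise per-device verification thresholds with a simple
# genetic algorithm over synthetic genuine/impostor match scores.
import numpy as np

rng = np.random.default_rng(0)
n_devices, pop_size, n_generations = 2, 40, 60

# Synthetic match scores: genuine scores are higher than impostor scores on average.
genuine = rng.normal(0.7, 0.1, (500, n_devices))
impostor = rng.normal(0.4, 0.1, (500, n_devices))

def cost(thresholds):
    # Fusion rule: accept only if every device accepts (AND rule); cost = FAR + FRR.
    frr = np.mean(~np.all(genuine >= thresholds, axis=1))
    far = np.mean(np.all(impostor >= thresholds, axis=1))
    return far + frr

population = rng.uniform(0.0, 1.0, (pop_size, n_devices))
for _ in range(n_generations):
    fitness = np.array([cost(ind) for ind in population])
    parents = population[np.argsort(fitness)[: pop_size // 2]]   # truncation selection
    # Uniform crossover between random parent pairs, then Gaussian mutation.
    idx = rng.integers(0, len(parents), (pop_size, 2))
    mask = rng.random((pop_size, n_devices)) < 0.5
    children = np.where(mask, parents[idx[:, 0]], parents[idx[:, 1]])
    children += rng.normal(0.0, 0.02, children.shape)
    population = np.clip(children, 0.0, 1.0)

best = population[np.argmin([cost(ind) for ind in population])]
print("optimised thresholds:", best, "cost:", cost(best))
```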

    Human-AI complex task planning

    The process of complex task planning is ubiquitous and arises in a variety of compelling applications. A few leading examples include designing a personalized course plan or trip plan, designing music playlists/work sessions in web applications, or even planning routes of naval assets to collaboratively discover an unknown destination. For all of these applications, creating a plan requires satisfying a basic construct, i.e., composing a sequence of sub-tasks (or items) that optimizes several criteria and satisfies constraints. For instance, in course planning, sub-tasks or items are core and elective courses, and degree requirements capture their complex dependencies as constraints. In trip planning, sub-tasks are points of interest (POIs) and constraints represent time and monetary budgets, or user-specified requirements. Needless to say, task plans are to be individualized and designed considering uncertainty. When done manually, the process is human-intensive and tedious, and unlikely to scale. The goal of this dissertation is to present computational frameworks that synthesize the capabilities of humans and AI algorithms to enable task planning at scale while satisfying multiple objectives and complex constraints. This dissertation makes significant contributions in four main areas: (i) proposing novel models, (ii) designing principled scalable algorithms, (iii) conducting rigorous experimental analysis, and (iv) deploying designed solutions in the real world. A suite of constrained and multi-objective optimization problems has been formalized, with a focus on their applicability across diverse domains. From an algorithmic perspective, the dissertation proposes principled algorithms with theoretical guarantees adapted from discrete optimization techniques, as well as Reinforcement Learning-based solutions. The memory and computational efficiency of these algorithms have been studied, and optimization opportunities have been proposed. The designed solutions are extensively evaluated on various large-scale real-world and synthetic datasets and compared against multiple baseline solutions after appropriate adaptation. This dissertation also presents user study results involving human subjects to validate the effectiveness of the proposed models. Lastly, a notable outcome of this dissertation is the deployment of one of the developed solutions at the Naval Postgraduate School. This deployment enables simultaneous route planning for multiple assets that is robust to uncertainty under multiple contexts.
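    The hedged sketch below illustrates the basic construct on a toy trip-planning instance: compose a set of points of interest that maximises value subject to a budget constraint, here with a simple greedy value-per-cost heuristic rather than the dissertation's principled algorithms. The POIs and values are made up for illustration.

```python
# Hedged sketch: greedy selection of points of interest under a budget
# constraint, as a toy stand-in for constrained task planning.
from dataclasses import dataclass

@dataclass
class POI:
    name: str
    value: float   # user-specific utility (assumed)
    cost: float    # e.g. visit time in hours (assumed)

def greedy_plan(pois, budget):
    """Pick POIs in decreasing value-per-cost order while the budget allows."""
    plan, remaining = [], budget
    for poi in sorted(pois, key=lambda p: p.value / p.cost, reverse=True):
        if poi.cost <= remaining:
            plan.append(poi.name)
            remaining -= poi.cost
    return plan

pois = [POI("museum", 8, 3), POI("old town", 6, 2), POI("viewpoint", 4, 1), POI("harbour", 5, 4)]
print(greedy_plan(pois, budget=6))   # ['viewpoint', 'old town', 'museum']
```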

    Data-Driven Exploration of Coarse-Grained Equations: Harnessing Machine Learning

    In scientific research, understanding and modeling physical systems often involves working with complex equations called Partial Differential Equations (PDEs). These equations are essential for describing the relationships between variables and their derivatives, allowing us to analyze a wide range of phenomena, from fluid dynamics to quantum mechanics. Traditionally, the discovery of PDEs relied on mathematical derivations and expert knowledge. However, the advent of data-driven approaches and machine learning (ML) techniques has transformed this process. By harnessing ML techniques and data analysis methods, data-driven approaches have revolutionized the task of uncovering complex equations that describe physical systems. The primary goal of this thesis is to develop methodologies that can automatically extract simplified equations by training models using available data. ML algorithms have the ability to learn underlying patterns and relationships within the data, making it possible to extract simplified equations that capture the essential behavior of the system. This study considers three distinct learning categories: black-box, gray-box, and white-box learning. The initial phase of the research focuses on black-box learning, where no prior information about the equations is available. Three different neural network architectures are explored: the multi-layer perceptron (MLP), the convolutional neural network (CNN), and a hybrid architecture combining a CNN with long short-term memory (CNN-LSTM). These neural networks are applied to uncover the non-linear equations of motion associated with phase-field models, which include both non-conserved and conserved order parameters. The second phase of the research addresses explicit equation discovery in gray-box learning scenarios, where a portion of the equation is unknown. The framework employs eXtended Physics-Informed Neural Networks (X-PINNs) and incorporates domain decomposition in space to uncover a segment of the widely known Allen-Cahn equation. Specifically, the Laplacian part of the equation is assumed to be known, while the objective is to discover the non-linear component of the equation. Moreover, symbolic regression techniques are applied to deduce the precise mathematical expression for the unknown segment of the equation. Finally, the last part of the thesis focuses on white-box learning, aiming to uncover equations that offer a detailed understanding of the studied system. Specifically, a coarse parametric ordinary differential equation (ODE) is introduced to accurately capture the spreading radius behavior of calcium-magnesium-aluminosilicate (CMAS) droplets. Through the Physics-Informed Neural Network (PINN) framework, the parameters of this ODE are determined, facilitating precise estimation. The architecture is employed to discover the unknown parameters of the equation, assuming that all terms of the ODE are known. This approach significantly improves our comprehension of the spreading dynamics associated with CMAS droplets.
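    The hedged sketch below illustrates the white-box, PINN-style parameter estimation step on an assumed first-order spreading-radius ODE, dR/dt = k (R_eq - R): a small network represents R(t), and the unknown parameters are trained jointly against data and the ODE residual. The ODE form, synthetic data, and hyperparameters are illustrative, not the CMAS model from the thesis.

```python
# Hedged sketch of PINN-style parameter estimation for an assumed ODE
# dR/dt = k * (R_eq - R). A small network models R(t); k and R_eq are trained.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Synthetic observations of the spreading radius (true k = 2.0, R_eq = 1.5).
t_data = torch.linspace(0, 3, 30).reshape(-1, 1)
R_data = 1.5 - 1.5 * torch.exp(-2.0 * t_data)

net = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 32), nn.Tanh(), nn.Linear(32, 1))
log_k = torch.zeros(1, requires_grad=True)       # unknown ODE parameters (log-parametrised)
log_R_eq = torch.zeros(1, requires_grad=True)
opt = torch.optim.Adam(list(net.parameters()) + [log_k, log_R_eq], lr=1e-2)

t_col = torch.linspace(0, 3, 100).reshape(-1, 1).requires_grad_(True)   # collocation points
for step in range(3000):
    opt.zero_grad()
    data_loss = torch.mean((net(t_data) - R_data) ** 2)
    R = net(t_col)
    dR_dt = torch.autograd.grad(R, t_col, torch.ones_like(R), create_graph=True)[0]
    residual = dR_dt - torch.exp(log_k) * (torch.exp(log_R_eq) - R)      # physics residual
    loss = data_loss + torch.mean(residual ** 2)
    loss.backward()
    opt.step()

print("estimated k:", torch.exp(log_k).item(), "estimated R_eq:", torch.exp(log_R_eq).item())
```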

    Farming out: a study.

    Farming is one of several ways of arranging for a group of individuals to perform work simultaneously. Farming is attractive. It is a simple concept, and yet it allocates work dynamically, balancing the load automatically. This gives rise to potentially great efficiency; yet the range of applications that can be farmed efficiently, and which implementation strategies are the most effective, has not been classified. This research has investigated the types of application, design and implementation that farm efficiently on computer systems constructed from a network of communicating parallel processors. This research shows that all applications can be farmed and identifies those concerns that dictate efficiency. For the first generation of transputer hardware, extensive experiments have been performed using Occam, independent of any specific application. This study identified the boundary conditions that dictate which design parameters farm efficiently. These boundary conditions are expressed in a general form that is directly applicable to other architectures. The specific quantitative results are of direct use to others who wish to implement farms on this architecture. Because of farming’s simplicity and potential for high efficiency, this work concludes that architects of parallel hardware should consider binding this paradigm into future systems, so as to enable the dynamic allocation of processes to processors to take place automatically. As well as resulting in high levels of machine utilisation for all programs, this would also permanently remove the burden of allocation from the programmer.
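    The hedged Python sketch below is a modern analogue of the farming paradigm studied here (originally in Occam on transputers): a pool of workers pulls work packets from a shared source, so the load balances dynamically without the programmer assigning work to specific processors. The task and pool size are illustrative.

```python
# Hedged sketch of a task farm: workers pull packets as they become free,
# giving automatic dynamic load balancing.
from multiprocessing import Pool

def work(n):
    """Stand-in work packet: sum of squares up to n."""
    return n, sum(i * i for i in range(n))

if __name__ == "__main__":
    tasks = [10_000 + i * 5_000 for i in range(20)]   # unevenly sized work packets
    with Pool(processes=4) as farm:
        # imap_unordered hands the next packet to whichever worker is free,
        # which is the dynamic allocation that makes farming attractive.
        for n, result in farm.imap_unordered(work, tasks):
            print(f"task {n}: {result}")
```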

    LIPIcs, Volume 261, ICALP 2023, Complete Volume

    LIPIcs, Volume 261, ICALP 2023, Complete Volume