2,345 research outputs found

    A continuous information gain measure to find the most discriminatory problems for AI benchmarking

    This paper introduces an information-theoretic method for selecting a subset of problems which gives the most information about a group of problem-solving algorithms. The method was tested on the games in the General Video Game AI (GVGAI) framework, allowing us to identify a smaller set of games that still gives a large amount of information about the abilities of different game-playing agents. This approach can be used to make agent testing more efficient: we can achieve almost as good discriminatory accuracy when testing on only a handful of games as when testing on more than a hundred, the latter of which is often computationally infeasible. Furthermore, the method can be extended to study the dimensions of the effective variance in game design between these games, allowing us to identify which games differentiate between agents in the most complementary ways.
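A rough sketch of the underlying idea: rank games by how well agent outcomes on them split the agent pool. The outcome matrix and all names below are hypothetical, and binary outcome entropy is a simplification — the paper's measure is a continuous information gain, not this.

```python
import math

# Hypothetical outcome matrix: rows = agents, columns = games,
# entries = 1 if the agent beats the game, 0 otherwise.
outcomes = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [0, 1, 1, 1],
    [0, 0, 1, 1],
]

def entropy(p):
    """Binary entropy in bits; zero for a degenerate split."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def game_information(outcomes, g):
    """Entropy of agent outcomes on game g: high when the game
    splits the agent pool evenly, zero when all agents tie."""
    col = [row[g] for row in outcomes]
    return entropy(sum(col) / len(col))

# Rank games by how much they discriminate between agents;
# the last game (won by everyone) carries no information.
ranked = sorted(range(len(outcomes[0])),
                key=lambda g: game_information(outcomes, g),
                reverse=True)
```

A greedy selection loop over `ranked` would then yield the small, highly discriminatory benchmark subset the abstract describes.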

    A Performance-Explainability-Fairness Framework For Benchmarking ML Models

    Machine learning (ML) models have achieved remarkable success in various applications; however, ensuring their robustness and fairness remains a critical challenge. In this research, we present a comprehensive framework designed to evaluate and benchmark ML models through the lenses of performance, explainability, and fairness. This framework addresses the increasing need for a holistic assessment of ML models, considering not only their predictive power but also their interpretability and equitable deployment. The proposed framework leverages a multi-faceted evaluation approach, integrating performance metrics with explainability and fairness assessments. Performance evaluation incorporates standard measures such as accuracy, precision, and recall, but extends to the overall balanced error rate and the overall area under the receiver operating characteristic (ROC) curve (AUC) to capture model behavior across different performance aspects. Explainability assessment employs state-of-the-art techniques to quantify the interpretability of model decisions, ensuring that model behavior can be understood and trusted by stakeholders. The fairness evaluation examines model predictions in terms of demographic parity and equalized odds, thereby addressing concerns of bias and discrimination in the deployment of ML systems. To demonstrate the practical utility of the framework, we apply it to a diverse set of ML algorithms across various functional domains, including finance, criminology, education, and healthcare prediction. The results showcase the importance of a balanced evaluation approach, revealing trade-offs between performance, explainability, and fairness that can inform model selection and deployment decisions. Furthermore, we provide insights into the analysis of trade-offs in selecting the appropriate model for use cases where performance, interpretability, and fairness are important.
In summary, the Performance-Explainability-Fairness Framework offers a unified methodology for evaluating and benchmarking ML models, enabling practitioners and researchers to make informed decisions about model suitability and ensuring responsible and equitable AI deployment. We believe that this framework represents a crucial step towards building trustworthy and accountable ML systems in an era where AI plays an increasingly prominent role in decision-making processes.
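Two of the metrics named above, the balanced error rate and demographic parity, can be illustrated with a minimal sketch. The data and helper names below are invented for illustration and are not taken from the framework itself.

```python
# Hypothetical evaluation set: labels, model predictions, and a
# binary protected attribute (e.g. two demographic groups).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
group  = [0, 0, 0, 0, 1, 1, 1, 1]

def rate(pred, cond):
    """Fraction of positive predictions among rows where cond holds."""
    sel = [p for p, c in zip(pred, cond) if c]
    return sum(sel) / len(sel) if sel else 0.0

def balanced_error_rate(y_true, y_pred):
    # Mean of the false-negative and false-positive rates, so that
    # class imbalance does not mask errors on the minority class.
    fnr = 1 - rate(y_pred, [t == 1 for t in y_true])
    fpr = rate(y_pred, [t == 0 for t in y_true])
    return (fnr + fpr) / 2

def demographic_parity_diff(y_pred, group):
    # |P(pred = 1 | group 0) - P(pred = 1 | group 1)|: zero means
    # both groups receive positive predictions at the same rate.
    return abs(rate(y_pred, [g == 0 for g in group])
               - rate(y_pred, [g == 1 for g in group]))
```

Equalized odds can be checked the same way by computing the parity difference separately among true positives and true negatives.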

    Ensemble decision systems for general video game playing

    Ensemble Decision Systems offer a unique form of decision making that allows a collection of algorithms to reason together about a problem. Each individual algorithm has its own inherent strengths and weaknesses, and it is often difficult to overcome the weaknesses while retaining the strengths. Instead of altering the properties of the algorithm, the Ensemble Decision System augments its performance with other algorithms that have complementing strengths. This work outlines different options for building an Ensemble Decision System and provides an analysis of its performance compared to the individual components of the system, with interesting results showing an increase in the generality of the algorithms without significantly impeding performance.
    Comment: 8 pages, accepted at COG201
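One simple way such an ensemble can combine its components is plurality voting over proposed actions. The agents, state representation, and action names below are invented for illustration; the paper itself explores several combination options.

```python
from collections import Counter

# Hypothetical component agents: each maps a game state to an action,
# and each has a different inherent bias (strength).
def aggressive_agent(state):
    return "attack" if state["enemy_near"] else "move"

def cautious_agent(state):
    return "flee" if state["low_health"] else "move"

def greedy_agent(state):
    return "collect" if state["item_near"] else "move"

def ensemble_decide(agents, state):
    """Plurality vote: each component proposes an action and the
    most popular proposal wins (ties broken by proposal order)."""
    votes = Counter(agent(state) for agent in agents)
    return votes.most_common(1)[0][0]

agents = [aggressive_agent, cautious_agent, greedy_agent]
state = {"enemy_near": False, "low_health": False, "item_near": True}
# aggressive -> "move", cautious -> "move", greedy -> "collect",
# so the ensemble settles on "move".
```

Because no single component dominates, the ensemble inherits each agent's specialty only when the others do not object, which is one route to the increased generality the abstract reports.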

    Outcomes for youth work : coming of age or master’s bidding?

    Providing evidence in youth work is a current and important debate. Modern youth work has, at least to some degree, recognised the need to produce practice information, through its various guises, with limited success as requirements and terminology have continually changed. In Scotland, the current demands for youth work to “prove” itself are made through a performance management system that promotes outcome-based practice. There are some difficulties with this position because outcome-based practice lacks methodological rigour, is aligned with national governmental commitments and does not adequately capture the impact of youth work practice. This paper argues that youth workers need to develop both a theoretical and a methodological approach to data collection and management, which is in keeping with practice values, captures the voice of the young person and enhances youth work practice. Youth work should not be used as a mechanism to deliver the government’s policies but be liberated from centralist control to become a “free practice”, so that some of the perennial problems, such as democratic disillusionment, partly caused by this “performance management industry”, can be effectively dealt with. The generation of evidence for youth work should enable it to freely investigate and capture its impact, within the practice, based on the learning that has taken place and the articulation of the learners’ voice, with the most appropriate form of data presentation.

    Investigating Trade-offs For Fair Machine Learning Systems

    Fairness in software systems aims to provide algorithms that operate in a nondiscriminatory manner with respect to protected attributes such as gender, race, or age. Ensuring fairness is a crucial non-functional property of data-driven Machine Learning systems. Several approaches (i.e., bias mitigation methods) have been proposed in the literature to reduce the bias of Machine Learning systems. However, this often comes hand in hand with performance deterioration. This thesis therefore addresses the trade-offs that practitioners face when debiasing Machine Learning systems. First, we perform a literature review to investigate the current state of the art for debiasing Machine Learning systems. This includes an overview of existing debiasing techniques and how they are evaluated (e.g., how bias is measured). As a second contribution, we propose a benchmarking approach that allows for an evaluation and comparison of bias mitigation methods and their trade-offs (i.e., how much performance is sacrificed for improving fairness). Afterwards, we propose a debiasing method ourselves, which modifies already trained Machine Learning models with the goal of improving both their fairness and accuracy. Moreover, this thesis addresses the challenge of how to deal with fairness with regard to age. This question is answered with an empirical evaluation on real-world datasets.
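The kind of trade-off such a benchmark quantifies can be sketched with one common post-processing idea: adjust per-group decision thresholds on an already trained model's scores and measure the accuracy given up for a smaller demographic-parity gap. The data and threshold values below are invented, and this is an illustrative technique rather than the thesis's own method.

```python
# Hypothetical scores from an already trained model, true labels,
# and a binary protected attribute.
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.55, 0.4, 0.2]
y_true = [1,   1,   1,   0,   1,   0,    0,   0]
group  = [0,   0,   0,   0,   1,   1,    1,   1]

def predict(scores, group, thr):
    """Threshold each score with its group's decision threshold."""
    return [1 if s >= thr[g] else 0 for s, g in zip(scores, group)]

def accuracy(pred, y):
    return sum(p == t for p, t in zip(pred, y)) / len(y)

def parity_gap(pred, group):
    """Absolute gap in positive-prediction rates between groups."""
    g0 = [p for p, g in zip(pred, group) if g == 0]
    g1 = [p for p, g in zip(pred, group) if g == 1]
    return abs(sum(g0) / len(g0) - sum(g1) / len(g1))

# A single global threshold versus group-specific thresholds chosen
# to equalise positive-prediction rates.
baseline = predict(scores, group, {0: 0.5, 1: 0.5})
debiased = predict(scores, group, {0: 0.75, 1: 0.5})
```

Here the baseline is more accurate but exhibits a parity gap, while the adjusted thresholds close the gap at a measurable accuracy cost: exactly the kind of sacrifice a benchmarking framework should report.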

    Dissecting Deep Language Models: The Explainability and Bias Perspective

    The abstract is in the attachment.

    Handbook for SDG-Aligned Food Companies: Four Pillar Framework Standards

    The world food system is in crisis. Outright hunger, unhealthy diets and malnutrition occur in parallel with food losses and waste. Farming families in poor countries suffer from extreme poverty. And food production is environmentally unsustainable and increasingly vulnerable to extreme weather events caused by climate change. A historic change of direction is needed to bring about a new era of food system sustainability. Our work aims to help companies, investors and other stakeholders move towards a more sustainable food system that is aligned with the Sustainable Development Goals. Transforming the world food system to achieve sustainability in all its dimensions is a major challenge. Achieving the Sustainable Development Goals will require managing major changes to the global food system responsibly, involving hundreds of millions of farmers and their families, global supply chains, thousands of food producing companies, diverse food production systems and local ecologies, food processing, and a great diversity of food traditions and cultures. Food companies are engaged in food production, trade, processing, and consumer sales around the world. While they have distinct roles “from farm to fork,” they all share the same responsibility: to be part of the global transformation towards food system sustainability. For more on CCSI and SDSN’s work on corporate alignment with the Sustainable Development Goals, see our framework defining SDG-aligned business practices in the energy sector.

    Nature of the learning algorithms for feedforward neural networks

    The neural network model (NN), composed of relatively simple computing elements operating in parallel, offers an attractive and versatile framework for exploring a variety of learning structures and processes for intelligent systems. Due to the amount of research developed in the area, many types of networks have been defined. The one of interest here is the multi-layer perceptron, as it is one of the simplest and is considered a powerful representation tool whose complete potential has not been adequately exploited and whose limitations have yet to be specified in a formal and coherent framework. This dissertation addresses the theory of generalisation performance and architecture selection for the multi-layer perceptron; a subsidiary aim is to compare and integrate this model with existing data analysis techniques and to exploit its potential by combining it with certain constructs from computational geometry, creating a reliable, coherent network design process which conforms to the characteristics of a generative learning algorithm, i.e. one including mechanisms for manipulating the connections and/or units that comprise the architecture in addition to the procedure for updating the weights of the connections. This means that it is unnecessary to provide an initial network as input to the complete training process.
After discussing in general terms the motivation for this study, the multi-layer perceptron model is introduced and reviewed, along with the relevant supervised training algorithm, i.e. backpropagation. More particularly, it is argued that a network developed employing this model can in general be trained and designed in a much better way by extracting more information about the domains of interest through the application of certain geometric constructs in a preprocessing stage, specifically by generating the Voronoi Diagram and Delaunay Triangulation [Okabe et al. 92] of the set of points comprising the training set. Once a final architecture which performs appropriately on it has been obtained, Principal Component Analysis [Jolliffe 86] is applied to the outputs produced by the units in the network's hidden layer to eliminate the redundant dimensions of this space.
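The final PCA step, pruning redundant hidden-layer dimensions, might look like the sketch below. The activations are synthetic (one hidden unit is an exact linear combination of the other two), and the 99% explained-variance cutoff is a hypothetical stand-in for whatever criterion the dissertation actually uses.

```python
import numpy as np

# Synthetic hidden-layer activations for a trained MLP:
# rows = training examples, columns = hidden units.
rng = np.random.default_rng(0)
base = rng.normal(size=(200, 2))                   # two genuine factors
hidden = np.column_stack([base[:, 0],              # unit 1
                          base[:, 1],              # unit 2
                          base[:, 0] + base[:, 1]])  # redundant unit

# PCA via eigen-decomposition of the activation covariance matrix.
centred = hidden - hidden.mean(axis=0)
cov = np.cov(centred, rowvar=False)
eigvals = np.linalg.eigvalsh(cov)[::-1]            # descending order

# Number of components needed to explain 99% of the variance:
# the remaining hidden dimensions are redundant and can be pruned.
explained = np.cumsum(eigvals) / eigvals.sum()
effective_dim = int(np.searchsorted(explained, 0.99) + 1)
```

Because the third unit adds no new variance, the covariance matrix has rank two and `effective_dim` comes out as 2, flagging one prunable hidden dimension.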