463 research outputs found

    Learning to Behave: Reinforcement Learning in Human Contexts

    Get PDF
    Reinforcement learning (RL) has recently attracted significant attention with applications such as improving microchip designs, predicting the behaviour of protein structures and beating humanity's best in the games of Go, chess and StarCraft II. These impressive and inspiring successes show how RL can improve our lives; however, they have so far been seen mostly in settings that involve humans to a very limited extent. This thesis looks into the use of RL in human contexts. First, we provide a systematic literature review of the use of RL for personalisation, i.e. the adaptation of systems to individuals. Next, we show how RL can be used to personalise a conversational recommender system and find that it outperforms existing approaches, including a gold standard and task-specific solutions, in a simulation-based study. Since simulators may not be available for all conversational systems that could benefit from personalisation, we next look into the collection of user satisfaction ratings for dialogue data. We consolidate best practices in a UI for user satisfaction annotation and show that high-quality ratings can be obtained. Next, we look into the use of RL for strategic workforce planning. Here, we find that RL is robust to the uncertainties that are an inherent part of this problem and that RL enables the specification of goals that are intuitive to domain experts. Having looked into these use cases, we then turn toward the inclusion of safety constraints in RL. We propose how safety constraints from a medical guideline can be taken into account in an observational study on the optimisation of ventilator settings for ICU patients. Next, we look into safety constraints that contain a temporal component. We find that these may make the learning problem infeasible and propose a solution based on reward shaping to address this issue. Finally, we propose how RL can benefit from instructions that break a full task into smaller pieces based on the options framework, and we propose an approach for learning reusable behaviours from instructions that greatly reduces data requirements.
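    The abstract names reward shaping as a remedy for temporal safety constraints that make learning infeasible. As an illustration only, here is a minimal sketch of potential-based reward shaping for a toy temporal task ("reach a checkpoint within a step budget"); the environment, the potential function and all names are assumptions for this example, not the thesis's actual formulation:

```python
# Hypothetical sketch: potential-based reward shaping for a temporally
# constrained toy task. Not the thesis's method; names are illustrative.

def shaped_reward(reward, state, next_state, gamma, potential):
    # Potential-based shaping: r' = r + gamma * phi(s') - phi(s).
    # This form preserves the optimal policy (Ng et al., 1999) while
    # giving a dense signal toward satisfying the temporal constraint.
    return reward + gamma * potential(next_state) - potential(state)

def potential(state):
    # Toy potential: negative distance to a checkpoint at position 10
    # on a 1-D line; states closer to the checkpoint rank higher.
    position, steps_left = state
    return -abs(10 - position)

# Moving one cell toward the checkpoint earns a small shaping bonus
# even while the sparse task reward is still zero.
s, s_next, gamma = (4, 7), (5, 6), 0.99
print(shaped_reward(0.0, s, s_next, gamma, potential))  # > 0
```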

    Collaborative Modeling of Processes: What Facilitation Support Does a Group Need?

    Get PDF
    Collaborative modeling of processes is increasingly being used in practice. However, collaborative modeling is difficult. To overcome the difficulties, a professional facilitator can be used. Collaboration Engineering takes up the challenge of designing collaboration processes that do not need a professional facilitator but can be facilitated by practitioners. This research contributes to this by identifying which facilitation aspects are important in collaborative modeling and which of these aspects can be transferred to practitioners. Three facilitation aspects are considered important: (1) guarding the rules of the modeling technique, (2) checking for completeness and (3) translating elements of reality into modeling concepts. The first facilitation aspect can be taken over by a tool that enforces the rules of the modeling technique. The second facilitation aspect can most likely be taken over by the practitioner, but for the third aspect a professional with modeling expertise is required.

    Low-Variance Policy Gradient Estimation with World Models

    Get PDF
    In this paper, we propose World Model Policy Gradient (WMPG), an approach to reduce the variance of policy gradient estimates using learned world models (WMs). In WMPG, a WM is trained online and used to imagine trajectories. The imagined trajectories are used in two ways: first, to calculate a without-replacement estimator of the policy gradient; second, their returns are used as an informed baseline. We compare the proposed approach with AC and MAC on a set of environments of increasing complexity (CartPole, LunarLander and Pong) and find that WMPG has better sample efficiency. Based on these results, we conclude that WMPG can yield increased sample efficiency in cases where a robust latent representation of the environment can be learned.
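    As an illustration of the informed-baseline idea, the sketch below imagines rollouts with a toy world model and averages their returns to form a baseline. The 1-D environment, the tabular policy and all function names are assumptions for this example, and the without-replacement gradient estimator itself is omitted:

```python
import random

# Hypothetical sketch of WMPG's informed baseline: a toy world model and
# tabular policy stand in for the learned components.

GAMMA = 0.99

def world_model_step(state, action):
    # Toy learned dynamics on a 1-D line: move left/right, reward at state 5.
    next_state = max(0, min(5, state + action))
    reward = 1.0 if next_state == 5 else 0.0
    return next_state, reward

def sample_action(policy, state):
    # policy[state] is the probability of action +1 (move right).
    return 1 if random.random() < policy[state] else -1

def imagined_return(policy, state, horizon=10):
    # Roll the world model forward to estimate the return from `state`.
    g, discount = 0.0, 1.0
    for _ in range(horizon):
        a = sample_action(policy, state)
        state, r = world_model_step(state, a)
        g += discount * r
        discount *= GAMMA
    return g

def informed_baseline(policy, state, n_rollouts=8):
    # Average return over imagined trajectories, used in place of a learned
    # critic to reduce the variance of the policy gradient estimate.
    return sum(imagined_return(policy, state) for _ in range(n_rollouts)) / n_rollouts

policy = [0.5] * 6
print(informed_baseline(policy, state=2))
```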

    Explainable Fraud Detection with Deep Symbolic Classification

    Full text link
    There is a growing demand for explainable, transparent, and data-driven models within the domain of fraud detection. Decisions made by fraud detection models need to be explainable in the event of a customer dispute. Additionally, the decision-making process in the model must be transparent to win the trust of regulators and business stakeholders. At the same time, fraud detection solutions can benefit from data due to the noisy, dynamic nature of fraud and the availability of large historical data sets. Finally, fraud detection is notorious for its class imbalance: there are typically several orders of magnitude more legitimate transactions than fraudulent ones. In this paper, we present Deep Symbolic Classification (DSC), an extension of the Deep Symbolic Regression framework to classification problems. DSC casts classification as a search problem in the space of all analytic functions composed of a vocabulary of variables, constants, and operations, and optimizes for an arbitrary evaluation metric directly. The search is guided by a deep neural network trained with reinforcement learning. Because the functions are concise, closed-form mathematical expressions, the model is inherently explainable, both at the level of a single classification decision and at the level of the model's decision process. Furthermore, the class imbalance problem is successfully addressed by optimizing for metrics that are robust to class imbalance, such as the F1 score. This eliminates the need for the oversampling and undersampling techniques that plague traditional approaches. Finally, the model allows one to explicitly balance prediction accuracy against explainability. An evaluation on the PaySim data set demonstrates competitive predictive performance with state-of-the-art models, while surpassing them in terms of explainability. This establishes DSC as a promising model for fraud detection systems.
    Comment: 12 pages, 3 figures. To be published in the 3rd International Workshop on Explainable AI in Finance at the 4th ACM International Conference on AI in Finance (ICAIF, https://ai-finance.org/).
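    To make the classification step concrete, the sketch below evaluates a toy candidate expression as a classifier and scores it with F1, the imbalance-robust metric such a search could optimize directly. The expression, data and names are illustrative assumptions, not the paper's implementation, and the RL-guided search that proposes expressions is omitted:

```python
import numpy as np
from sklearn.metrics import f1_score

def classify(expr, X, threshold=0.5):
    # A candidate analytic expression maps each row to a real score;
    # a sigmoid turns the score into a fraud probability.
    probs = 1.0 / (1.0 + np.exp(-expr(X)))
    return (probs >= threshold).astype(int)

# Toy candidate expression over two features (e.g. amount, balance delta).
expr = lambda X: 0.8 * X[:, 0] - 1.2 * X[:, 1] + 0.1

X = np.array([[3.0, 0.5], [0.2, 2.0], [2.5, 0.1], [0.1, 1.5]])
y = np.array([1, 0, 1, 0])

# The F1 score of this candidate is the reward signal the search would use.
print(f1_score(y, classify(expr, X)))
```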

    A Repeatable Collaboration Process for Developing a Road Map for Emerging New Technology Business: Case Mobile Marketing

    Get PDF
    The unique and little-practiced characteristics of mobile as a marketing medium create a need to regularly set up an action and research agenda to foster the development of the mobile marketing value system. Numerous stakeholders take part in the network that delivers the mobile services. Strengthening their inter-organizational relationships is critical for the emerging value system to evolve. Our paper employs Collaboration Engineering to address this undertaking by designing a standard process that actors in mobile marketing, as well as in other emerging new technology businesses, can use to collaboratively develop a road map for the future. The first field test of this process was conducted in London in connection with the Mobile Marketing Summit '04 organized by Nokia. The results are promising: together with senior management of 25 leading brand marketers and advertising agencies, we were able to outline an extensive road map while strengthening the network formation in the field.

    Log Parsing Evaluation in the Era of Modern Software Systems

    Full text link
    Due to the complexity and size of modern software systems, the amount of logs generated is tremendous. Hence, it is infeasible to manually investigate these data in a reasonable time, which requires automating log analysis to derive insights about the functioning of the systems. Motivated by an industry use case, we zoom in on one integral part of automated log analysis, log parsing, which is the prerequisite to deriving any insights from logs. Our investigation reveals problematic aspects within the log parsing field, particularly its inefficiency in handling heterogeneous real-world logs. We show this by assessing the 14 most-recognized log parsing approaches in the literature using (i) nine publicly available datasets, (ii) one dataset composed of combined publicly available data, and (iii) one dataset generated within the infrastructure of a large bank. Subsequently, toward improving log parsing robustness in real-world production scenarios, we propose a tool, Logchimera, that enables estimating log parsing performance in industry contexts by generating synthetic log data that resemble industry logs. Our contributions serve as a foundation to consolidate past research efforts, facilitate future research advancements, and establish a strong link between research and industry log parsing.
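    For intuition about the task being evaluated, the sketch below shows naive template extraction, the core of what a log parser does: masking the variable parts of a line so that lines from the same logging statement collapse to one template. The regexes and example lines are illustrative assumptions, not Logchimera or any of the benchmarked parsers:

```python
import re

def extract_template(line):
    # Mask common variable fields so the constant template remains.
    line = re.sub(r"\b\d+\.\d+\.\d+\.\d+\b", "<IP>", line)  # IPv4 addresses
    line = re.sub(r"\b0x[0-9a-fA-F]+\b", "<HEX>", line)     # hex values
    line = re.sub(r"\b\d+\b", "<NUM>", line)                # plain numbers
    return line

logs = [
    "Connection from 10.0.0.1 closed after 3041 ms",
    "Connection from 192.168.1.9 closed after 17 ms",
]
# Both lines collapse to the same template, the grouping parsers are judged on.
for line in logs:
    print(extract_template(line))
```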
