    An expert system methodology for SMEs and NPOs

    Traditionally, Expert Systems (ES) require a full analysis of the business problem by a Knowledge Engineer (KE) to develop a solution. This inherently makes ES technology very expensive and beyond the means of the majority of Small and Medium-sized Enterprises (SMEs) and Non-Profit Organisations (NPOs). SMEs and NPOs therefore tend to have access only to off-the-shelf solutions to generic problems, which rarely meet the full extent of an organisation's requirements. One existing methodological stream of research, Ripple-Down Rules (RDR), goes some of the way towards suiting SMEs and NPOs, as it removes the need for a knowledge engineer. This group of methodologies provides an environment in which a company can itself develop large knowledge-based systems, specifically tailored to its individual situation. These methods, however, require constant supervision by the expert during development, which remains a significant burden on the organisation. This paper discusses an extension to an RDR method known as Rated MCRDR (RM), together with a feature called prudence analysis. This enhanced methodology is particularly well suited to the development of ES in restricted environments such as SMEs and NPOs.
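
    To make the RDR idea concrete, here is a minimal Python sketch of a single-classification ripple-down tree, in which the expert corrects a misclassification by attaching a local exception rule rather than re-engineering the knowledge base. The rule conditions, labels and the loan-style case are invented for illustration; real RDR systems also store the cornerstone case that justified each rule.

    class Rule:
        """One node in a single-classification ripple-down tree."""

        def __init__(self, condition, conclusion):
            self.condition = condition    # predicate over a case (a dict)
            self.conclusion = conclusion  # label returned if the rule fires
            self.except_rule = None       # consulted when this rule fires
            self.else_rule = None         # consulted when this rule does not fire

        def classify(self, case, fallback=None):
            if self.condition(case):
                # The rule fires, but a deeper exception may still override it.
                if self.except_rule:
                    return self.except_rule.classify(case, self.conclusion)
                return self.conclusion
            if self.else_rule:
                return self.else_rule.classify(case, fallback)
            return fallback

    # Default rule: everything is "approve" until the expert says otherwise.
    root = Rule(lambda c: True, "approve")

    # The expert sees a misclassified case and adds a local exception:
    # no knowledge engineer required.
    root.except_rule = Rule(lambda c: c["income"] < 20_000, "refer")

    print(root.classify({"income": 50_000}))  # -> approve
    print(root.classify({"income": 15_000}))  # -> refer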

    The impact of environmental stochasticity on value-based multiobjective reinforcement learning

    A common approach to addressing multiobjective problems with reinforcement learning is to extend model-free, value-based algorithms such as Q-learning to use a vector of Q-values in combination with an appropriate action selection mechanism, often based on scalarisation. Most prior empirical evaluation of these approaches has focused on deterministic environments. This study examines the impact of stochasticity in rewards and state transitions on the behaviour of multiobjective Q-learning. It shows that the nature of the optimal solution depends on these environmental characteristics, and also on whether we desire to maximise the Expected Scalarised Return (ESR) or the Scalarised Expected Return (SER). We also identify a novel aim, which may arise in some applications, of maximising SER subject to satisfying constraints on the variation in return, and show that this may require different solutions than ESR or conventional SER. The analysis of the interaction between environmental stochasticity and multiobjective Q-learning is supported by empirical evaluations on several simple multiobjective Markov Decision Processes with varying characteristics. This includes a demonstration of a novel approach to learning deterministic SER-optimal policies for environments with stochastic rewards. In addition, we report a previously unidentified issue with model-free, value-based approaches to multiobjective reinforcement learning in environments with stochastic state transitions. Having highlighted the limitations of value-based, model-free MORL methods, we discuss several alternative methods that may be more suitable for maximising SER in MOMDPs with stochastic transitions. © 2021, The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature.
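
    As a rough illustration of the class of algorithm studied, the following Python sketch shows tabular multiobjective Q-learning with a vector of Q-values per state-action pair and linearly scalarised action selection. The sizes, weights and hyperparameters are invented, and the paper's benchmarks and exact scalarisation are not reproduced here.

    import numpy as np

    # Tabular multiobjective Q-learning sketch: each (state, action) pair holds
    # a vector of Q-values, one per objective. All sizes/weights are invented.
    n_states, n_actions, n_objectives = 10, 4, 2
    Q = np.zeros((n_states, n_actions, n_objectives))
    weights = np.array([0.7, 0.3])  # assumed linear scalarisation weights
    alpha, gamma, eps = 0.1, 0.95, 0.1
    rng = np.random.default_rng(0)

    def select_action(s):
        # Epsilon-greedy over the linearly scalarised Q-vectors.
        if rng.random() < eps:
            return int(rng.integers(n_actions))
        return int(np.argmax(Q[s] @ weights))

    def update(s, a, reward_vec, s_next):
        # Bootstrap from the action that is greedy under the scalarisation.
        a_next = int(np.argmax(Q[s_next] @ weights))
        Q[s, a] += alpha * (reward_vec + gamma * Q[s_next, a_next] - Q[s, a])

    # Note: with a linear scalarisation, ESR and SER coincide; they diverge
    # when a nonlinear scalarisation meets stochastic returns, which is part
    # of what the paper analyses.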

    Explainable reinforcement learning for broad-XAI: a conceptual framework and survey

    Broad-XAI moves away from interpreting individual decisions based on a single datum and aims to integrate explanations from multiple machine learning algorithms into a coherent explanation of an agent's behaviour, aligned to the communication needs of the explainee. We propose that Reinforcement Learning (RL) methods provide a potential backbone for the cognitive model required for the development of Broad-XAI. RL represents a suite of approaches that have had increasing success in solving a range of sequential decision-making problems. However, these algorithms operate as black-box problem solvers, obscuring their decision-making policy behind a complex array of values and functions. EXplainable RL (XRL) aims to develop techniques for extracting concepts from the agent's perception of the environment; its intrinsic/extrinsic motivations and beliefs; and its Q-values, goals and objectives. This paper introduces the Causal XRL Framework (CXF), which unifies current XRL research and uses RL as a backbone for the development of Broad-XAI. CXF is designed to incorporate many standard RL extensions and to integrate with external ontologies and communication facilities so that the agent can answer questions that explain the outcomes of its decisions. This paper aims to: establish XRL as a distinct branch of XAI; introduce a conceptual framework for XRL; review existing approaches to explaining agent behaviour; and identify opportunities for future research. Finally, it discusses how additional information can be extracted and ultimately integrated into models of communication, facilitating the development of Broad-XAI. © 2023, The Author(s).
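
    As a toy illustration of one XRL ingredient mentioned above, extracting an explanation from an agent's Q-values, consider the sketch below. The action names and values are invented, and this is in no way an implementation of CXF itself.

    import numpy as np

    # Toy illustration of one XRL ingredient: a contrastive explanation read
    # off the agent's Q-values. Actions and values are invented; this is not
    # an implementation of the Causal XRL Framework.
    actions = ["north", "south", "east", "west"]
    q_values = np.array([0.12, 0.05, 0.81, 0.30])  # Q(s, a) in the current state

    best = int(np.argmax(q_values))
    runner_up = int(np.argsort(q_values)[-2])

    print(f"Chose '{actions[best]}' ({q_values[best]:.2f}) over "
          f"'{actions[runner_up]}' ({q_values[runner_up]:.2f}), the next "
          f"best option by estimated return.")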

    Softmax exploration strategies for multiobjective reinforcement learning

    Despite growing interest in recent years in applying reinforcement learning to multiobjective problems, there has been little research into the applicability and effectiveness of exploration strategies in the multiobjective context. This work considers several widely used approaches to exploration from the single-objective reinforcement learning literature and examines their incorporation into multiobjective Q-learning. In particular, this paper proposes two novel approaches which extend the softmax operator to work with vector-valued rewards. The performance of these exploration strategies is evaluated across a set of benchmark environments. Issues arising from the multiobjective formulation of these benchmarks which impact the performance of the exploration strategies are identified. It is shown that, of the techniques considered, the combination of the novel softmax-epsilon exploration with optimistic initialisation provides the most effective trade-off between exploration and exploitation.
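
    The abstract does not spell out the proposed operators, but one plausible reading of a softmax-epsilon strategy (scalarise the Q-vectors, apply a temperature softmax, then mix in uniform exploration) can be sketched as follows; the weights, temperature and example values are invented.

    import numpy as np

    # One plausible "softmax-epsilon" for vector rewards: scalarise the
    # Q-vectors, apply a temperature softmax, then mix in uniform exploration.
    # Not necessarily the paper's exact operator.
    def softmax_epsilon(q_vectors, weights, tau=0.5, eps=0.1, rng=None):
        rng = rng or np.random.default_rng()
        scalar_q = q_vectors @ weights          # (n_actions,) scalarised values
        prefs = scalar_q / tau
        prefs -= prefs.max()                    # numerical stability
        p_softmax = np.exp(prefs) / np.exp(prefs).sum()
        p_uniform = np.full(len(scalar_q), 1.0 / len(scalar_q))
        probs = (1 - eps) * p_softmax + eps * p_uniform
        return int(rng.choice(len(scalar_q), p=probs))

    # Example: three actions, two objectives, weights favouring objective one.
    q = np.array([[1.0, 0.0], [0.4, 0.9], [0.2, 0.2]])
    print(softmax_epsilon(q, weights=np.array([0.8, 0.2])))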

    Language representations for generalization in reinforcement learning

    The choice of state and action representation in Reinforcement Learning (RL) has a significant effect on agent performance for the training task, but its relationship with generalization to new tasks is under-explored. One approach to improving generalization investigated here is the use of language as a representation. We compare vector states and discrete actions to language representations. We find that agents using language representations generalize better and can solve tasks with more entities, new entities, and more complexity than seen in the training task. We attribute this to the compositionality of language.
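
    A toy contrast between the two kinds of representation being compared might look as follows; the gridworld and encodings are invented for illustration.

    # Vector state: fixed-size numeric features. Adding a new entity type
    # changes the layout, which hurts transfer to tasks with more entities.
    vector_state = [2.0, 3.0,   # agent (x, y)
                    5.0, 1.0,   # key (x, y)
                    0.0]        # door open? (0 or 1)

    # Language state: compositional, so extra or novel entities can be
    # described without changing the representation's structure.
    language_state = "You are at (2, 3). A key lies at (5, 1). The door is closed."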

    Generalising symbolic knowledge in online classification and prediction

    Increasingly, researchers and developers of knowledge-based systems (KBS) have been incorporating the notion of context. For instance, Repertory Grids, Formal Concept Analysis (FCA) and Ripple-Down Rules (RDR) all integrate either implicit or explicit contextual information. However, these methodologies treat context as a static entity, neglecting much of the connectionist work on learning hidden and dynamic contexts, which aids those systems' ability to generalise. This paper presents a method that models hidden context within a symbolic domain in order to achieve a level of generalisation. The method developed builds on the already established Multiple Classification Ripple-Down Rules (MCRDR) approach and is referred to as Rated MCRDR (RM). RM retains a symbolic core while using a connection-based approach to learn a deeper understanding of the captured knowledge. The method is applied to a number of classification and prediction environments, and results indicate that it can learn information that experts have difficulty providing. © Springer-Verlag Berlin Heidelberg 2009.
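
    The abstract does not detail RM's architecture, but the general hybrid idea, a symbolic rule base whose pattern of fired rules feeds a small trainable rating layer, can be sketched as below. The rules, case features and learning scheme are all invented stand-ins.

    import numpy as np

    # Hybrid sketch: a symbolic rule base (stand-in for MCRDR) reports which
    # rules fired; a small trainable layer learns a rating from that pattern.
    RULES = [
        lambda c: c["temp"] > 38.0,
        lambda c: c["temp"] > 40.0,
        lambda c: c["age"] > 65,
        lambda c: c["bp"] < 90,
        lambda c: True,  # default rule
    ]
    weights = np.zeros(len(RULES))
    lr = 0.05

    def fired_rules(case):
        # Stand-in for MCRDR inference: a 0/1 indicator per rule.
        return np.array([float(rule(case)) for rule in RULES])

    def predict(case):
        return float(weights @ fired_rules(case))

    def train(case, target):
        # One gradient step on squared error: the layer learns what the
        # expert's rules imply but cannot state as a single number.
        global weights
        x = fired_rules(case)
        weights = weights + lr * (target - weights @ x) * x

    train({"temp": 39.0, "age": 70, "bp": 85}, target=0.9)
    print(predict({"temp": 39.0, "age": 70, "bp": 85}))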

    Human Engagement Providing Evaluative and Informative Advice for Interactive Reinforcement Learning

    Reinforcement learning is an approach used by intelligent agents to autonomously learn new skills. Although reinforcement learning has been demonstrated to be an effective learning approach in several different contexts, a common drawback is the time needed to satisfactorily learn a task, especially in large state-action spaces. To address this issue, interactive reinforcement learning proposes the use of externally sourced information in order to speed up the learning process. To date, different information sources have been used to give advice to the learner agent, among them human-sourced advice. When interacting with a learner agent, humans may provide either evaluative or informative advice. From the agent's perspective these styles of interaction are commonly referred to as reward-shaping and policy-shaping respectively. Evaluative advice requires the human to provide feedback on the action just performed, while informative advice suggests the best action to select in a given situation. Prior research has focused on the effect of human-sourced advice on the interactive reinforcement learning process, specifically aiming to improve the learning speed of the agent while reducing the engagement required of the human. This work presents an experimental setup for a human trial designed to compare the methods people use to deliver advice in terms of human engagement. The results show that users giving informative advice to the learner agents provide more accurate advice, are willing to assist the learner agent for a longer time, and provide more advice per episode. Additionally, self-evaluation by participants using the informative approach indicates that they perceive the agent's ability to follow the advice as higher, and therefore feel their own advice to be of higher accuracy, compared to people providing evaluative advice.
    Comment: 33 pages, 15 figures
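
    A minimal sketch of how the two advice styles enter a tabular Q-learning agent, assuming a common simple shaping scheme rather than the paper's exact setup, might look as follows.

    import numpy as np

    # Sketch of how the two advice styles enter tabular Q-learning. The
    # shaping scheme is a common simple choice, not the paper's exact setup.
    n_states, n_actions = 8, 4
    Q = np.zeros((n_states, n_actions))
    alpha, gamma, eps = 0.1, 0.9, 0.1

    def update_with_evaluative_advice(s, a, r_env, s_next, feedback):
        # Reward shaping: human feedback (+1 good / -1 bad) on the action
        # just taken is added to the environment reward.
        r = r_env + feedback
        Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])

    def choose_with_informative_advice(s, advised_action=None):
        # Policy shaping (simplified): a suggested action overrides the
        # agent's own epsilon-greedy choice for this state.
        if advised_action is not None:
            return advised_action
        if np.random.random() < eps:
            return int(np.random.randint(n_actions))
        return int(np.argmax(Q[s]))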

    Prediction using a symbolic based hybrid system

    Knowledge Based Systems (KBS) are highly successful in classification and diagnostic situations; however, they are generally unable to produce specific values for prediction problems. When used for prediction, they either employ some form of uncertainty reasoning or use a classification-style inference in which each class stands for a discrete predictive value. This paper applies a hybrid algorithm that allows an expert's knowledge to be adapted to provide continuous values, solving prediction problems. The method applied to prediction in this paper builds on the already established Multiple Classification Ripple-Down Rules (MCRDR) approach and is referred to as Rated MCRDR (RM). The method is published in a parallel paper in this workshop, titled Generalisation with Symbolic Knowledge in Online Classification. Results indicate a strong propensity to adapt quickly and provide accurate predictions.
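
    For contrast, the classification-style workaround described above can be sketched in a few lines; the classes and values are invented. A hybrid such as RM instead learns a continuous output from the fired rules, as in the sketch under the parallel paper above.

    # The classification-style workaround: each class maps to one fixed value,
    # so predictions are limited to a handful of discrete outputs.
    class_to_value = {"low": 10.0, "medium": 50.0, "high": 90.0}

    def kbs_predict(case_class):
        return class_to_value[case_class]

    print(kbs_predict("medium"))  # -> 50.0, however "medium" the case really is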