5 research outputs found

    On Reducing Undesirable Behavior in Deep Reinforcement Learning Models

    Deep reinforcement learning (DRL) has proven extremely useful in a large variety of application domains. However, even successful DRL-based software can exhibit highly undesirable behavior. This is because DRL training maximizes a reward function, which typically captures general trends but cannot precisely capture, or rule out, certain behaviors of the system. In this paper, we propose a novel framework aimed at drastically reducing the undesirable behavior of DRL-based software while maintaining its excellent performance. In addition, our framework can help provide engineers with a comprehensible characterization of such undesirable behavior. Under the hood, our approach extracts decision tree classifiers from erroneous state-action pairs and then integrates these trees into the DRL training loop, penalizing the system whenever it makes an error. We provide a proof-of-concept implementation of our approach and use it to evaluate the technique on three significant case studies. We find that our approach can extend existing frameworks in a straightforward manner and incurs only a slight overhead in training time. Further, it incurs only a very slight hit to performance, or in some cases even improves it, while significantly reducing the frequency of undesirable behavior.
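The penalty step described in this abstract can be pictured with a minimal sketch. Everything named here is an assumption for illustration: the state layout, the action encoding, the hand-written `is_erroneous` rule (a stand-in for a decision tree that would be extracted from erroneous state-action pairs), and the `penalty` value. It is not the authors' implementation.

```python
from typing import Callable, Tuple

# Hypothetical encodings, chosen only for this example.
State = Tuple[float, float]   # (distance_to_obstacle, speed)
Action = int                  # 0 = brake, 1 = accelerate


def is_erroneous(state: State, action: Action) -> bool:
    """Hand-written stand-in for a decision tree learned from erroneous pairs.

    Each nested `if` plays the role of one split in the tree.
    """
    distance, speed = state
    if distance < 1.0:
        if speed > 0.5:
            # Close to an obstacle and moving fast: accelerating is flagged.
            return action == 1
        return False
    return False


def shaped_reward(
    reward: float,
    state: State,
    action: Action,
    classifier: Callable[[State, Action], bool] = is_erroneous,
    penalty: float = 1.0,
) -> float:
    """Subtract a fixed penalty when the classifier flags the pair as an error."""
    if classifier(state, action):
        return reward - penalty
    return reward
```

In a training loop, `shaped_reward` would be applied to the environment reward at each step before it is passed to the learner; because the classifier is a small tree, its splits also give a readable description of when the agent errs.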

    Safe Deep Reinforcement Learning: Enhancing the Reliability of Intelligent Systems

    In the last few years, the impressive success of deep reinforcement learning (DRL) agents in a wide variety of applications has led to the adoption of these systems in safety-critical contexts (e.g., autonomous driving, robotics, and medical applications), where expensive hardware and human safety can be involved. In such contexts, an intelligent learning agent must adhere to requirements that go beyond simply accomplishing the task and typically include constraints on the agent's behavior. Against this background, this thesis proposes a set of training and validation methodologies that constitute a unified pipeline for generating safe and reliable DRL agents. In the first part of this dissertation, we focus on the problem of constrained DRL, leaving the challenging problem of the formal verification of deep neural networks for the second part. For humans, a mentor's help is crucial to learning effective problem-solving strategies as we grow, whereas learning driven only by trial and error usually leads to unsafe and inefficient solutions. Similarly, a pure end-to-end deep reinforcement learning approach often results in suboptimal policies, which typically translates into unpredictable, and thus unreliable, behaviors. Following this intuition, we propose to introduce a set of constraints into the DRL loop to guide the training process. These requirements, which typically encode domain expert knowledge, can be seen as suggestions that the agent should follow but is allowed to ignore occasionally if doing so helps maximize the reward signal. A foundational requirement for our work is finding a proper strategy to define and formally encode these constraints (which we refer to as rules). In this thesis, we propose to exploit a formal language inherited from the software engineering community: scenario-based programming (SBP). For the actual training, we rely on the constrained reinforcement learning paradigm, proposing an extended version of the Lagrangian PPO algorithm. Returning to the parallel with human beings: before being authorized to perform safety-critical operations, a person must obtain a certification (e.g., a license to drive a car or a degree to perform medical operations). In the second part of this dissertation, we apply this concept in a deep reinforcement learning context, where the intelligent agents are controlled by artificial neural networks. In particular, we propose to perform a model selection phase after training to find models that formally respect given safety requirements before deployment. However, DNNs have long been considered unpredictable black boxes and thus unsuitable for safety-critical contexts. Against this background, we build upon the emerging field of formal verification for neural networks to extend state-of-the-art approaches to robotic decision-making contexts. We propose "ProVe", a verification tool for decision-making DNNs that quantifies the probability of violating the specified requirements. In the last chapter of this thesis, we provide a complete case study on a popular robotic problem: "mapless navigation". Here, we show a concrete application of our pipeline, from the definition of the requirements through training to the final formal verification phase, to obtain a provably safe and effective agent.
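As a rough illustration of the constrained-training idea mentioned in this abstract, the sketch below shows the generic Lagrangian relaxation used in constrained reinforcement learning: a multiplier grows while the measured constraint cost exceeds its limit and shrinks (never below zero) otherwise. This is the textbook scheme, not the extended Lagrangian PPO variant the thesis proposes; the function names, `cost_limit`, and `lr` are hypothetical.

```python
def update_multiplier(
    lam: float, episode_cost: float, cost_limit: float, lr: float = 0.05
) -> float:
    """Dual ascent step: raise lam when the cost limit is exceeded, lower it otherwise.

    The multiplier is clipped at zero so a satisfied constraint never
    turns into a bonus.
    """
    return max(0.0, lam + lr * (episode_cost - cost_limit))


def penalized_objective(reward_return: float, cost_return: float, lam: float) -> float:
    """Scalar objective the policy update would maximize: reward minus weighted cost."""
    return reward_return - lam * cost_return
```

Because the penalty weight adapts rather than being fixed, the agent can trade some rule violations for reward early on, which matches the abstract's description of rules as suggestions the agent may sometimes ignore.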

    Teaching programming for novices by designing computer games

    Programming is abstract and hard, especially for beginners at the primary level of education. The traditional way of teaching programming relies on text-based programming languages, which further complicates learning for beginners by emphasizing syntax problems. An additional problem is the programming context, which is mostly limited to solving math problems and reduces the motivation of members of the digital age. Syntax problems can be reduced, and the context shifted from solving math problems towards, for example, designing computer games, by using visual programming languages appropriate for the target age. Accordingly, three main research goals were set: a) determine the effect on teaching programming to elementary school novices of designing games in a block-based visual programming language appropriate for the students' age; b) determine the effect of shifting the learning context from math problems to computer game design on elementary school students' understanding of the loop concept and on their motivation; c) devise a model for teaching programming to elementary school novices using tools and tasks appropriate for the students' age. In line with these goals, the research was organized in four stages carried out over four school years in elementary schools. The overall results showed that students achieve better results when programming is taught to elementary school novices in the context of game design and with an age-appropriate visual, block-based programming language. Students' attitude toward programming was also found to be better than with traditional teaching. The results can serve as guidelines for developing new ways of teaching programming to novices in elementary schools; based on them, a new didactic strategy that enriches methodological knowledge will be proposed.
