335 research outputs found

    ViSpec: A graphical tool for elicitation of MTL requirements

    One of the main barriers preventing widespread use of formal methods is the elicitation of formal specifications. Formal specifications facilitate the testing and verification process for safety-critical robotic systems. However, handling the intricacies of formal languages is difficult and requires a high level of expertise in formal logics that many system developers do not have. In this work, we present a graphical tool designed for the development and visualization of formal specifications by people who do not have training in formal logic. The tool enables users to develop specifications using a graphical formalism, which is then automatically translated to Metric Temporal Logic (MTL). In order to evaluate the effectiveness of our tool, we have also designed and conducted a usability study with cohorts from the academic student community and industry. Our results indicate that both groups were able to define formal requirements with high levels of accuracy. Finally, we present applications of our tool for defining specifications for robotic surgery and the safe operation of autonomous quadcopters. Comment: Technical report for the paper to be published in the 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems held in Hamburg, Germany. Includes 10 pages and 19 figures.
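    As an illustration (not an example taken from the paper), MTL extends linear temporal logic with real-time intervals on its operators; a quadcopter requirement of the kind mentioned above might be written, for an assumed altitude signal $\mathit{alt}$, as

        \Box \big( \mathit{alt} < 1 \;\rightarrow\; \Diamond_{[0,2]}\, \mathit{alt} \geq 1 \big)

    read as: at all times, if the altitude drops below 1 m, it must return to at least 1 m within 2 seconds.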

    Formal Verification of Robotic Contact Tasks via Reachability Analysis

    Verifying the correct behavior of robots in contact tasks is challenging due to model uncertainties associated with contacts. Standard testing methods often fall short since they cannot cover all (uncountably many) solutions. Instead, we propose to formally and efficiently verify robot behaviors in contact tasks using reachability analysis, which enables checking all the reachable states against user-provided specifications. To this end, we extend the state of the art in reachability analysis for hybrid (mixed discrete and continuous) dynamics subject to discrete-time input trajectories. In particular, we present a novel and scalable guard-intersection approach to reliably compute the complex behavior caused by contacts. We model robots subject to contacts as hybrid automata in which crucial time delays are included. The usefulness of our approach is demonstrated by verifying safe human-robot interaction in the presence of constrained collisions, which was out of reach for existing methods. Comment: This work has been accepted by the 22nd IFAC World Congress (2023 in Yokohama, Japan).
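    To convey the flavor of set-based reachability (the paper itself uses zonotope representations and hybrid automata with guard intersections, which are far more involved), the sketch below over-approximates the reachable sets of a discrete-time linear system with plain interval arithmetic; the function name and setup are illustrative assumptions, and all arguments are numpy arrays:

        import numpy as np

        def interval_reach(A, B, x_lo, x_hi, u_lo, u_hi, w_bound, steps):
            """Box over-approximation of the reachable sets of
            x[k+1] = A x[k] + B u[k] + w[k], with box-bounded states and
            inputs and disturbance |w| <= w_bound (elementwise). Splitting
            the matrices into positive and negative parts keeps the
            interval bounds sound."""
            Ap, An = np.clip(A, 0, None), np.clip(A, None, 0)
            Bp, Bn = np.clip(B, 0, None), np.clip(B, None, 0)
            boxes = [(x_lo, x_hi)]
            for _ in range(steps):
                x_lo, x_hi = (Ap @ x_lo + An @ x_hi + Bp @ u_lo + Bn @ u_hi - w_bound,
                              Ap @ x_hi + An @ x_lo + Bp @ u_hi + Bn @ u_lo + w_bound)
                boxes.append((x_lo, x_hi))
            return boxes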

    Logic programming for deliberative robotic task planning

    Over the last decade, the use of robots in production and daily life has increased. With increasingly complex tasks and interaction in different environments, including with humans, robots require a higher level of autonomy for efficient deliberation. Task planning is a key element of deliberation. It combines elementary operations into a structured plan to satisfy a prescribed goal, given specifications on the robot and the environment. In this manuscript, we present a survey on recent advances in the application of logic programming to the problem of task planning. Logic programming offers several advantages compared to other approaches, including greater expressivity and interpretability, which may aid in the development of safe and reliable robots. We analyze different planners and their suitability for specific robotic applications, based on expressivity in domain representation, computational efficiency and software implementation. In this way, we support the robotic designer in choosing the best tool for their application.
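    As a minimal sketch of the paradigm the survey covers (a made-up toy domain, not one of the surveyed planners; it assumes the python-clingo bindings are installed), an answer set program encodes actions, effects and goals as rules, and the solver searches for a stable model that constitutes a plan:

        import clingo  # python-clingo bindings (assumed installed)

        # Toy planning domain: reach location c within a fixed horizon.
        PROGRAM = """
        #const horizon = 3.
        time(0..horizon).
        location(a; b; c).

        at(a, 0).                                  % initial state
        1 { move(L, T) : location(L) } 1 :- time(T), T < horizon.
        at(L, T+1) :- move(L, T).                  % effect of a chosen move
        :- not at(c, horizon).                     % goal constraint
        #show move/2.
        """

        ctl = clingo.Control()
        ctl.add("base", [], PROGRAM)
        ctl.ground([("base", [])])
        ctl.solve(on_model=lambda m: print("plan:", m))  # prints move/2 atoms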

    Sampling-Based Approximation Algorithms for Reachability Analysis with Provable Guarantees

    The successful deployment of many autonomous systems in part hinges on providing rigorous guarantees on their performance and safety through a formal verification method, such as reachability analysis. In this work, we present a simple-to-implement, sampling-based algorithm for reachability analysis that is provably optimal up to any desired approximation accuracy. Our method achieves computational efficiency by judiciously sampling a finite subset of the state space and generating an approximate reachable set by conducting reachability analysis on this finite set of states. We prove that the reachable set generated by our algorithm approximates the ground-truth reachable set for any user-specified approximation accuracy. As a corollary to our main method, we introduce an asymptotically optimal, anytime algorithm for reachability analysis. We present simulation results that reaffirm the theoretical properties of our algorithm and demonstrate its effectiveness in real-world-inspired scenarios. Funding: National Science Foundation (U.S.).
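    A minimal sketch of the sampling idea (illustrative only: the paper's algorithm additionally inflates each sample by an epsilon ball so that the resulting union provably covers the true reachable set, bookkeeping that is omitted here), with a made-up dynamics map:

        import numpy as np

        def sampled_reach(dynamics, x_lo, x_hi, n_samples, steps, seed=0):
            """Propagate a finite sample of the initial box through the
            dynamics and return the point cloud reached at each step."""
            rng = np.random.default_rng(seed)
            X = rng.uniform(x_lo, x_hi, size=(n_samples, len(x_lo)))
            clouds = [X]
            for _ in range(steps):
                X = np.apply_along_axis(dynamics, 1, X)  # push samples forward
                clouds.append(X)
            return clouds

        # Usage with a made-up stable rotation map:
        A = 0.95 * np.array([[np.cos(0.1), -np.sin(0.1)],
                             [np.sin(0.1),  np.cos(0.1)]])
        clouds = sampled_reach(lambda x: A @ x, [-1.0, -1.0], [1.0, 1.0], 500, 20)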

    A Vision of Collaborative Verification-Driven Engineering of Hybrid Systems

    Hybrid systems with both discrete and continuous dynamics are an important model for real-world physical systems. The key challenge is how to ensure their correct functioning w.r.t. safety requirements. Promising techniques to ensure safety seem to be model-driven engineering, to develop hybrid systems in a well-defined and traceable manner, and formal verification, to prove their correctness. Their combination forms the vision of verification-driven engineering. Despite the remarkable progress in automating formal verification of hybrid systems, the construction of proofs of complex systems often requires significant human guidance, since hybrid systems verification tools solve undecidable problems. It is thus not uncommon for verification teams to consist of many players with diverse expertise. This paper introduces a verification-driven engineering toolset that extends our previous work on hybrid and arithmetic verification with tools for (i) modeling hybrid systems, (ii) exchanging and comparing models and proofs, and (iii) managing verification tasks. This toolset makes it easier to tackle large-scale verification tasks.

    Enhancing Exploration and Safety in Deep Reinforcement Learning

    A Deep Reinforcement Learning (DRL) agent tries to learn a policy maximizing a long-term objective by trial and error in large state spaces. However, this learning paradigm requires a non-trivial amount of interaction with the environment to achieve good performance. Moreover, critical applications, such as robotics, typically involve safety criteria to consider while designing novel DRL solutions. Hence, devising safe learning approaches with efficient exploration is crucial to avoid getting stuck in local optima, failing to learn properly, or causing damage to the surrounding environment. This thesis focuses on developing Deep Reinforcement Learning algorithms that foster efficient exploration and safer behaviors in simulation and in real domains of interest, ranging from robotics to multi-agent systems. To this end, we rely both on standard benchmarks, such as SafetyGym, and on robotic tasks widely adopted in the literature (e.g., manipulation, navigation). This variety of problems is crucial to assess the statistical significance of our empirical studies and the generalization skills of our approaches. We initially benchmark the sample efficiency versus performance trade-off between value-based and policy-gradient algorithms. This part highlights the benefits of using non-standard simulation environments (i.e., Unity), which also facilitates the development of further optimizations for DRL. We also discuss the limitations of standard evaluation metrics (e.g., return) in characterizing the actual behaviors of a policy, proposing the use of Formal Verification (FV) as a practical methodology to evaluate behaviors against desired specifications. The second part introduces Evolutionary Algorithms (EAs) as a gradient-free, complementary optimization strategy. In detail, we combine population-based and gradient-based DRL to diversify exploration and improve performance in both single-agent and multi-agent applications. For the latter, we discuss how prior Multi-Agent (Deep) Reinforcement Learning (MARL) approaches hinder exploration, proposing an architecture that favors cooperation without affecting exploration.
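    For a taste of the population-based ingredient, the sketch below is a generic evolution-strategies loop, not the thesis's specific EA-DRL hybrid; the evaluate callback is a stand-in for an episodic-return measurement:

        import numpy as np

        def evolution_strategies(evaluate, dim, pop_size=50, sigma=0.1,
                                 lr=0.02, iters=200, seed=0):
            """Sample a population of parameter perturbations, score each
            by return, and step the mean parameters along the estimated
            search gradient."""
            rng = np.random.default_rng(seed)
            theta = np.zeros(dim)
            for _ in range(iters):
                eps = rng.standard_normal((pop_size, dim))
                returns = np.array([evaluate(theta + sigma * e) for e in eps])
                adv = (returns - returns.mean()) / (returns.std() + 1e-8)
                theta += lr / (pop_size * sigma) * eps.T @ adv
            return theta

        # Usage with a stand-in objective (a real run would roll out a policy):
        best = evolution_strategies(lambda p: -np.sum((p - 1.0) ** 2), dim=10)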

    Deep Reinforcement Learning in Surgical Robotics: Enhancing the Automation Level

    Surgical robotics is a rapidly evolving field that is transforming the landscape of surgery. Surgical robots have been shown to enhance precision, minimize invasiveness, and alleviate surgeon fatigue. One promising area of research in surgical robotics is the use of reinforcement learning to enhance the automation level. Reinforcement learning is a type of machine learning that involves training an agent to make decisions based on rewards and punishments. This literature review aims to comprehensively analyze existing research on reinforcement learning in surgical robotics. The review identified various applications of reinforcement learning in surgical robotics, including pre-operative, intra-body, and percutaneous procedures, listed representative studies, and compared their methodologies and results. The findings show that reinforcement learning has great potential to improve the autonomy of surgical robots. Reinforcement learning can teach robots to perform complex surgical tasks, such as suturing and tissue manipulation. It can also improve the accuracy and precision of surgical robots, making them more effective at performing surgeries.
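    For readers unfamiliar with the paradigm, the sketch below is a textbook tabular Q-learning loop, a generic illustration rather than any surveyed method; the step callback standing in for an environment is hypothetical:

        import numpy as np

        def q_learning(n_states, n_actions, step, episodes=500,
                       alpha=0.1, gamma=0.99, eps=0.1, seed=0):
            """Act epsilon-greedily, observe a reward (or punishment, i.e.
            negative reward), and nudge the action-value estimate toward
            the observed one-step return."""
            rng = np.random.default_rng(seed)
            Q = np.zeros((n_states, n_actions))
            for _ in range(episodes):
                s, done = 0, False
                while not done:
                    a = rng.integers(n_actions) if rng.random() < eps else int(Q[s].argmax())
                    s2, r, done = step(s, a)  # hypothetical environment callback
                    target = r + gamma * (0.0 if done else Q[s2].max())
                    Q[s, a] += alpha * (target - Q[s, a])
                    s = s2
            return Q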

    Safe Deep Reinforcement Learning: Enhancing the Reliability of Intelligent Systems

    In the last few years, the impressive success of deep reinforcement learning (DRL) agents in a wide variety of applications has led to the adoption of these systems in safety-critical contexts (e.g., autonomous driving, robotics, and medical applications), where expensive hardware and human safety can be involved. In such contexts, an intelligent learning agent must adhere to certain requirements that go beyond the simple accomplishment of the task and typically include constraints on the agent's behavior. Against this background, this thesis proposes a set of training and validation methodologies that constitute a unified pipeline to generate safe and reliable DRL agents. In the first part of this dissertation, we focus on the problem of constrained DRL, leaving the challenging problem of the formal verification of deep neural networks (DNNs) for the second part of this work. For humans, as we grow, the help of a mentor is crucial to learn effective strategies to solve a problem, while a learning process driven only by trial and error usually leads to unsafe and inefficient solutions. Similarly, a pure end-to-end deep reinforcement learning approach often results in suboptimal policies, which typically translates into unpredictable, and thus unreliable, behaviors. Following this intuition, we propose to impose a set of constraints on the DRL loop to guide the training process. These requirements, which typically encode domain expert knowledge, can be seen as suggestions that the agent should follow but is allowed to occasionally ignore if doing so maximizes the reward signal. A foundational requirement for our work is finding a proper strategy to define and formally encode these constraints (which we refer to as "rules"). In this thesis, we propose to exploit a formal language inherited from the software engineering community: scenario-based programming (SBP). For the actual training, we rely on the constrained reinforcement learning paradigm, proposing an extended version of the Lagrangian PPO algorithm. Recalling the parallel with human beings, before being authorized to perform safety-critical operations, we must obtain a certification (e.g., a license to drive a car or a degree to perform medical operations). In the second part of this dissertation, we apply this concept in a deep reinforcement learning context, where the intelligent agents are controlled by artificial neural networks. In particular, we propose to perform a model selection phase after training to find models that formally respect some given safety requirements before deployment. However, DNNs have long been considered unpredictable black boxes and thus unsuitable for safety-critical contexts. Against this background, we build upon the emerging field of formal verification for neural networks to extend state-of-the-art approaches to robotic decision-making contexts. We propose "ProVe", a verification tool for decision-making DNNs that quantifies the probability of violating the specified requirements. In the last chapter of this thesis, we provide a complete case study on a popular robotic problem: "mapless navigation". Here, we show a concrete example of the application of our pipeline, starting from the definition of the requirements to the training and the final formal verification phase, to finally obtain a provably safe and effective agent.
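    A minimal sketch of the Lagrangian mechanics underlying such constrained training (a generic PPO-Lagrangian-style update, not the thesis's extended algorithm; the surrogate losses, measured cost, and cost limit are assumed inputs):

        import torch

        def lagrangian_objective(reward_loss, cost_loss, avg_episode_cost,
                                 cost_limit, log_lambda, lambda_lr=0.01):
            """reward_loss / cost_loss are PPO surrogate losses computed on
            reward and cost advantages; avg_episode_cost is measured from
            rollouts. The multiplier grows while the constraint is violated,
            increasingly penalizing unsafe behavior."""
            with torch.no_grad():  # dual ascent on the multiplier
                log_lambda += lambda_lr * (avg_episode_cost - cost_limit)
            lam = log_lambda.exp()  # exponentiation keeps lambda positive
            # Primal objective: penalize the policy in proportion to lambda.
            return (reward_loss + lam * cost_loss) / (1.0 + lam)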

    Interpretable task planning and learning for autonomous robotic surgery with logic programming

    This thesis addresses the long-term goal of full (supervised) autonomy in surgery, characterized by dynamic environmental (anatomical) conditions, unpredictable workflow of execution and workspace constraints. The scope is to reach autonomy at the level of sub-tasks of a surgical procedure, i.e. repetitive yet tedious operations (e.g., dexterous manipulation of small objects in a constrained environment, such as needle and wire in suturing). This will help reduce execution time, hospital costs and surgeon fatigue during the whole procedure, while further improving recovery time for patients. A novel framework for autonomous surgical task execution is presented in the first part of this thesis, based on answer set programming (ASP), a logic programming paradigm, for task planning (i.e., coordination of elementary actions and motions). Logic programming allows surgical task knowledge to be encoded directly, representing a "plan reasoning methodology" rather than a set of pre-defined plans. This solution introduces several key advantages, such as reliable, human-like, interpretable plan generation and real-time monitoring of the environment and the workflow for ready adaptation and failure recovery. Moreover, an extended review of logic programming for robotics is presented, motivating the choice of ASP for surgery and providing a useful guide for robotic designers. In the second part of the thesis, a novel framework based on inductive logic programming (ILP) is presented for surgical task knowledge learning and refinement. ILP guarantees fast learning from very few examples, addressing the scarcity of data that is common in surgery. Also, a novel action identification algorithm is proposed, based on automatic environmental feature extraction from videos, dealing for the first time with small, noisy datasets that collect different workflows of execution under environmental variations. This allows a systematic methodology for unsupervised ILP to be defined. All the results in this thesis are validated on a non-standard version of the ring transfer task, a benchmark training task for surgeons that mimics some of the challenges of real surgery, e.g., constrained bimanual motion in a small space.

    ํ™•๋ฅ ์  ์•ˆ์ „์„ฑ ๊ฒ€์ฆ์„ ์œ„ํ•œ ์•ˆ์ „ ๊ฐ•ํ™”ํ•™์Šต: ๋žดํ‘ธ๋…ธ๋ธŒ ๊ธฐ๋ฐ˜ ๋ฐฉ๋ฒ•๋ก 

    ํ•™์œ„๋…ผ๋ฌธ (์„์‚ฌ) -- ์„œ์šธ๋Œ€ํ•™๊ต ๋Œ€ํ•™์› : ๊ณต๊ณผ๋Œ€ํ•™ ์ „๊ธฐยท์ •๋ณด๊ณตํ•™๋ถ€, 2020. 8. ์–‘์ธ์ˆœ.Emerging applications in robotic and autonomous systems, such as autonomous driving and robotic surgery, often involve critical safety constraints that must be satisfied even when information about system models is limited. In this regard, we propose a model-free safety specification method that learns the maximal probability of safe operation by carefully combining probabilistic reachability analysis and safe reinforcement learning (RL). Our approach constructs a Lyapunov function with respect to a safe policy to restrain each policy improvement stage. As a result, it yields a sequence of safe policies that determine the range of safe operation, called the safe set, which monotonically expands and gradually converges. We also develop an efficient safe exploration scheme that accelerates the process of identifying the safety of unexamined states. Exploiting the Lyapunov shieding, our method regulates the exploratory policy to avoid dangerous states with high confidence. To handle high-dimensional systems, we further extend our approach to deep RL by introducing a Lagrangian relaxation technique to establish a tractable actor-critic algorithm. The empirical performance of our method is demonstrated through continuous control benchmark problems, such as a reaching task on a planar robot arm.์ž์œจ์ฃผํ–‰, ๋กœ๋ด‡ ์ˆ˜์ˆ  ๋“ฑ ์ž์œจ์‹œ์Šคํ…œ ๋ฐ ๋กœ๋ณดํ‹ฑ์Šค์˜ ๋– ์˜ค๋ฅด๋Š” ์‘์šฉ ๋ถ„์•ผ์˜ ์ ˆ๋Œ€ ๋‹ค์ˆ˜๋Š” ์•ˆ์ „ํ•œ ๋™์ž‘์„ ๋ณด์žฅํ•˜๊ธฐ ์œ„ํ•ด ์ผ์ •ํ•œ ์ œ์•ฝ์„ ํ•„์š”๋กœ ํ•œ๋‹ค. ํŠนํžˆ, ์•ˆ์ „์ œ์•ฝ์€ ์‹œ์Šคํ…œ ๋ชจ๋ธ์— ๋Œ€ํ•ด ์ œํ•œ๋œ ์ •๋ณด๋งŒ ์•Œ๋ ค์ ธ ์žˆ์„ ๋•Œ์—๋„ ๋ณด์žฅ๋˜์–ด์•ผ ํ•œ๋‹ค. ์ด์— ๋”ฐ๋ผ, ๋ณธ ๋…ผ๋ฌธ์—์„œ๋Š” ํ™•๋ฅ ์  ๋„๋‹ฌ์„ฑ ๋ถ„์„(probabilistic reachability analysis)๊ณผ ์•ˆ์ „ ๊ฐ•ํ™”ํ•™์Šต(safe reinforcement learning)์„ ๊ฒฐํ•ฉํ•˜์—ฌ ์‹œ์Šคํ…œ์ด ์•ˆ์ „ํ•˜๊ฒŒ ๋™์ž‘ํ•  ํ™•๋ฅ ์˜ ์ตœ๋Œ“๊ฐ’์œผ๋กœ ์ •์˜๋˜๋Š” ์•ˆ์ „ ์‚ฌ์–‘์„ ๋ณ„๋„์˜ ๋ชจ๋ธ ์—†์ด ์ถ”์ •ํ•˜๋Š” ๋ฐฉ๋ฒ•๋ก ์„ ์ œ์•ˆํ•œ๋‹ค. ์šฐ๋ฆฌ์˜ ์ ‘๊ทผ๋ฒ•์€ ๋งค๋ฒˆ ์ •์ฑ…์„ ์ƒˆ๋กœ ๊ตฌํ•˜๋Š” ๊ณผ์ •์—์„œ ๊ทธ ๊ฒฐ๊ณผ๋ฌผ์ด ์•ˆ์ „ํ•จ์— ๋Œ€ํ•œ ๊ธฐ์ค€์„ ์ถฉ์กฑ์‹œํ‚ค๋„๋ก ์ œํ•œ์„ ๊ฑฐ๋Š” ๊ฒƒ์œผ๋กœ, ์ด๋ฅผ ์œ„ํ•ด ์•ˆ์ „ํ•œ ์ •์ฑ…์— ๊ด€ํ•œ ๋žดํ‘ธ๋…ธํ”„ ํ•จ์ˆ˜๋ฅผ ๊ตฌ์ถ•ํ•œ๋‹ค. ๊ทธ ๊ฒฐ๊ณผ๋กœ ์‚ฐ์ถœ๋˜๋Š” ์ผ๋ จ์˜ ์ •์ฑ…์œผ๋กœ๋ถ€ํ„ฐ ์•ˆ์ „ ์ง‘ํ•ฉ(safe set)์ด๋ผ ๋ถˆ๋ฆฌ๋Š” ์•ˆ์ „ํ•œ ๋™์ž‘์ด ๋ณด์žฅ๋˜๋Š” ์˜์—ญ์ด ๊ณ„์‚ฐ๋˜๊ณ , ์ด ์ง‘ํ•ฉ์€ ๋‹จ์กฐ๋กญ๊ฒŒ ํ™•์žฅํ•˜์—ฌ ์ ์ฐจ ์ตœ์ ํ•ด๋กœ ์ˆ˜๋ ดํ•˜๋„๋ก ๋งŒ๋‹ค. ๋˜ํ•œ, ์šฐ๋ฆฌ๋Š” ์กฐ์‚ฌ๋˜์ง€ ์•Š์€ ์ƒํƒœ์˜ ์•ˆ์ „์„ฑ์„ ๋” ๋น ๋ฅด๊ฒŒ ํŒŒ์•…ํ•  ์ˆ˜ ์žˆ๋Š” ํšจ์œจ์ ์ธ ์•ˆ์ „ ํƒ์‚ฌ ์ฒด๊ณ„๋ฅผ ๊ฐœ๋ฐœํ•˜์˜€๋‹ค. ๋žดํ‘ธ๋…ธ๋ธŒ ์ฐจํ๋ฅผ ์ด์šฉํ•œ ๊ฒฐ๊ณผ, ์šฐ๋ฆฌ๊ฐ€ ์ œ์•ˆํ•˜๋Š” ํƒํ—˜ ์ •์ฑ…์€ ๋†’์€ ํ™•๋ฅ ๋กœ ์œ„ํ—˜ํ•˜๋‹ค ์—ฌ๊ฒจ์ง€๋Š” ์ƒํƒœ๋ฅผ ํ”ผํ•˜๋„๋ก ์ œํ•œ์ด ๊ฑธ๋ฆฐ๋‹ค. ์—ฌ๊ธฐ์— ๋”ํ•ด ์šฐ๋ฆฌ๋Š” ๊ณ ์ฐจ์› ์‹œ์Šคํ…œ์„ ์ฒ˜๋ฆฌํ•˜๊ธฐ ์œ„ํ•ด ์ œ์•ˆํ•œ ๋ฐฉ๋ฒ•์„ ์‹ฌ์ธต๊ฐ•ํ™”ํ•™์Šต์œผ๋กœ ํ™•์žฅํ–ˆ๊ณ , ๊ตฌํ˜„ ๊ฐ€๋Šฅํ•œ ์•กํ„ฐ-ํฌ๋ฆฌํ‹ฑ ์•Œ๊ณ ๋ฆฌ์ฆ˜์„ ๋งŒ๋“ค๊ธฐ ์œ„ํ•ด ๋ผ๊ทธ๋ž‘์ฃผ ์ด์™„๋ฒ•์„ ์‚ฌ์šฉํ•˜์˜€๋‹ค. 
๋”๋ถˆ์–ด ๋ณธ ๋ฐฉ๋ฒ•์˜ ์‹คํšจ์„ฑ์€ ์—ฐ์†์ ์ธ ์ œ์–ด ๋ฒค์น˜๋งˆํฌ์ธ 2์ฐจ์› ํ‰๋ฉด์—์„œ ๋™์ž‘ํ•˜๋Š” 2-DOF ๋กœ๋ด‡ ํŒ”์„ ํ†ตํ•ด ์‹คํ—˜์ ์œผ๋กœ ์ž…์ฆ๋˜์—ˆ๋‹ค.Chapter 1 Introduction 1 Chapter 2 Related work 4 Chapter 3 Background 6 3.1 Probabilistic Reachability and Safety Specifications 6 3.2 Safe Reinforcement Learning 8 Chapter 4 Lyapunov-Based Safe Reinforcement Learning for Safety Specification 10 4.1 Lyapunov Safety Specification 11 4.2 Efficient Safe Exploration 14 4.3 Deep RL Implementation 19 Chapter 5 Simulation Studies 23 5.1 Tabular Q-Learning 25 5.2 Deep RL 27 5.3 Experimental Setup 31 5.3.1 Deep RL Implementation 31 5.3.2 Environments 32 Chapter 6 Conclusion 35 Bibliography 35 ์ดˆ๋ก 41 Acknowledgements 42Maste
    • โ€ฆ
    corecore