20,056 research outputs found

    Deep Reinforcement Learning for Wireless Sensor Scheduling in Cyber-Physical Systems

    Full text link
    In many Cyber-Physical Systems, we encounter the problem of remote state estimation of geographically distributed and remote physical processes. This paper studies the scheduling of sensor transmissions to estimate the states of multiple remote, dynamic processes. Information from the different sensors has to be transmitted to a central gateway over a wireless network for monitoring purposes, where typically fewer wireless channels are available than there are processes to be monitored. For effective estimation at the gateway, the sensors need to be scheduled appropriately, i.e., at each time instant one needs to decide which sensors have network access and which ones do not. To address this scheduling problem, we formulate an associated Markov decision process (MDP). This MDP is then solved using a Deep Q-Network, a recent deep reinforcement learning algorithm that is at once scalable and model-free. We compare our scheduling algorithm to popular scheduling algorithms such as round-robin and reduced-waiting-time, among others. Our algorithm is shown to significantly outperform these algorithms in many example scenarios.
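
    The abstract does not spell out the MDP ingredients, so the sketch below is a minimal, assumption-laden illustration of the setup it describes: the state is taken to be each process's age of information, an action picks which K of the N sensors transmit, and a small Deep Q-Network is trained with one-step TD updates (the replay buffer and target network standard in DQN are omitted for brevity). All sizes, rewards, and dynamics are illustrative, not the paper's.

    ```python
    import itertools
    import random
    import torch
    import torch.nn as nn

    N, K = 5, 2                                          # processes, channels (illustrative)
    ACTIONS = list(itertools.combinations(range(N), K))  # which K sensors get channel access

    qnet = nn.Sequential(nn.Linear(N, 64), nn.ReLU(), nn.Linear(64, len(ACTIONS)))
    opt = torch.optim.Adam(qnet.parameters(), lr=1e-3)
    gamma, eps = 0.95, 0.1

    def step(age, action):
        """Scheduled sensors refresh their estimates; the others grow stale."""
        age = [0.0 if i in ACTIONS[action] else a + 1.0 for i, a in enumerate(age)]
        return age, -sum(age)                            # reward: keep information fresh

    age = [0.0] * N
    for t in range(5000):
        s = torch.tensor(age)
        a = random.randrange(len(ACTIONS)) if random.random() < eps \
            else int(qnet(s).argmax())                   # epsilon-greedy scheduling
        age, r = step(age, a)
        target = r + gamma * qnet(torch.tensor(age)).max().detach()
        loss = (qnet(s)[a] - target) ** 2
        opt.zero_grad(); loss.backward(); opt.step()
    ```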

    Hazard Contribution Modes of Machine Learning Components

    Get PDF
    Amongst the essential steps to be taken towards developing and deploying safe systems with embedded learning-enabled components (LECs), i.e., software components that use machine learning (ML), are to analyze and understand the contribution of the constituent LECs to safety, and to assure that those contributions have been appropriately managed. This paper addresses both steps by, first, introducing the notion of hazard contribution modes (HCMs), a categorization of the ways in which the ML elements of LECs can contribute to hazardous system states; and, second, describing how argumentation patterns can capture the reasoning that can be used to assure HCM mitigation. Our framework is generic in the sense that the categories of HCMs developed i) can admit different learning schemes, i.e., supervised, unsupervised, and reinforcement learning, and ii) are not dependent on the type of system in which the LECs are embedded, i.e., both cyber and cyber-physical systems. One of the goals of this work is to serve as a starting point for systematizing LEC analysis towards eventually automating it in a tool.
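
    The HCM categories themselves are not enumerated in this abstract, so the sketch below only illustrates how an identified HCM record might be represented once the analysis is done; every field name and value is a hypothetical placeholder, not the paper's notation.

    ```python
    from dataclasses import dataclass
    from enum import Enum

    class LearningScheme(Enum):       # the framework admits all three (per the abstract)
        SUPERVISED = "supervised"
        UNSUPERVISED = "unsupervised"
        REINFORCEMENT = "reinforcement"

    @dataclass
    class HazardContributionMode:
        """One identified way an ML element can contribute to a hazardous state."""
        name: str
        scheme: LearningScheme
        hazardous_state: str          # the system-level state contributed to
        mitigation_argument: str      # reference to an assurance-argument pattern

    example = HazardContributionMode( # purely hypothetical example entry
        name="out-of-distribution misclassification",
        scheme=LearningScheme.SUPERVISED,
        hazardous_state="controller acts on wrong object class",
        mitigation_argument="pattern: runtime OOD monitor + safe fallback",
    )
    ```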

    Resource offload consolidation based on deep-reinforcement learning approach in cyber-physical systems

    Get PDF
    In cyber-physical systems, it is advantageous to leverage cloud together with edge resources to distribute the workload for processing and computing user data at the point of generation. Services offered by the cloud are not flexible enough against variations in the size of the underlying data, which leads to increased latency, deadline violations, and higher cost. On the other hand, resolving the above-mentioned issues with edge devices with limited resources is also challenging. In this work, a novel reinforcement learning algorithm, Capacity-Cost Ratio-Reinforcement Learning (CCR-RL), is proposed which considers both resource utilization and cost for the target cyber-physical systems. In CCR-RL, the task offloading decision is made considering the data arrival rate, edge device computation power, and underlying transmission capacity. Then, a deep learning model is created to allocate resources based on the underlying communication and computation rate. Moreover, new algorithms are proposed to regulate the allocation of communication and computation resources for the workload among edge devices and edge servers. The simulation results demonstrate that the proposed method can achieve minimal latency and reduced processing cost compared to state-of-the-art schemes.
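
    The abstract names the capacity-cost ratio but does not give its formula, so the following is a plausible stand-in, assuming the ratio trades effective capacity (transmission plus computation time) against monetary cost; the node parameters and numbers are illustrative only, not the authors' model.

    ```python
    from dataclasses import dataclass

    @dataclass
    class Node:
        compute_rate: float   # operations/s available at this node
        cost_per_sec: float   # monetary cost of using the node
        link_rate: float      # bits/s to reach the node (inf for a local device)

    def capacity_cost_ratio(node: Node, task_bits: float, task_ops: float) -> float:
        """Effective capacity (inverse of transmission + compute latency)
        divided by cost: higher means a better offload target."""
        latency = task_bits / node.link_rate + task_ops / node.compute_rate
        return (1.0 / latency) / node.cost_per_sec

    def choose_target(nodes, task_bits, task_ops):
        return max(nodes, key=lambda n: capacity_cost_ratio(n, task_bits, task_ops))

    edge = Node(compute_rate=1e9, cost_per_sec=0.1, link_rate=float("inf"))
    cloud = Node(compute_rate=2e10, cost_per_sec=1.0, link_rate=5e7)
    best = choose_target([edge, cloud], task_bits=8e6, task_ops=4e9)
    ```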

    RAMARL: Robustness Analysis with Multi-Agent Reinforcement Learning - Robust Reasoning in Autonomous Cyber-Physical Systems

    Get PDF
    A key driver for offering smart services is an infrastructure of Cyber-Physical Systems (CPSs). By definition, CPSs are intertwined physical and computational components that integrate physical behaviour with computation, so as to autonomously execute a task or a set of tasks that provides one or more services to end users. In real-life applications, CPSs operate in dynamically changing surroundings characterized by unexpected or unpredictable situations. Such operations involve complex interactions between multiple intelligent agents in a highly non-stationary environment. For safety reasons, a CPS should withstand a certain amount of disruption and carry out its operations in a stable and robust manner when performing complex tasks. Recent advances in reinforcement learning have proven suitable for enabling multiple agents to robustly adapt to their environment, yet they often depend on a massive amount of training data and experiences. In these cases, robustness analysis outlines the necessary components and specifications in a framework, ensuring reliable and stable behaviour while accounting for the dynamicity of the environment. This paper presents a combination of multi-agent reinforcement learning with robustness analysis, shaping a cyber-physical system infrastructure that reasons robustly in a dynamically changing environment. The combination strengthens the reinforcement learning, increasing the reliability and flexibility of the system by applying robustness analysis. Robustness analysis identifies vulnerability issues when the system interacts with a dynamically changing environment. Based on this identification, when incorporated into the system, robustness analysis suggests robust solutions and actions rather than the optimal ones provided by reinforcement learning alone. Results from the combination show that this infrastructure can enable reliable operations with the flexibility to adapt to changing environment dynamics.
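
    As a minimal sketch of the "robust rather than optimal" idea described above, one might wrap the agents' learned policies in a robustness check that can override their proposed joint action with a conservative fallback; the function names and the shape of the check are assumptions, not the paper's interface.

    ```python
    from typing import Callable, Sequence

    def robust_joint_action(
        states: Sequence,                     # one observation per agent
        policies: Sequence[Callable],         # per-agent learned policies
        is_robust: Callable,                  # robustness-analysis verdict
        fallback: Callable,                   # conservative action chooser
    ):
        joint = [pi(s) for pi, s in zip(policies, states)]
        if is_robust(states, joint):          # e.g., worst-case disturbance check
            return joint                      # optimal actions pass the analysis
        return [fallback(s) for s in states]  # otherwise act conservatively
    ```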

    USING REINFORCEMENT LEARNING TO SPOOF A MONITORED KALMAN FILTER

    Get PDF
    Modern hardware systems rely on state estimators such as Kalman filters to monitor key variables for feedback and performance monitoring. The performance of the hardware system can be monitored using a chi-squared fault detection test. Previous work has shown that Kalman filters are susceptible to false data injection attacks, in which intentional noise and/or bias is added to sensor measurement data to mislead a Kalman filter in a way that goes undetected by the chi-squared test. This thesis proposes a method to deceive a Kalman filter where the attack data is generated using reinforcement learning. It is shown that reinforcement learning can be used to train an agent to manipulate the output of a Kalman filter via false data injection without being detected by the chi-squared test. This result shows that machine learning can be used to successfully perform a cyber-physical attack by an actor who does not need in-depth knowledge and understanding of the mathematics governing the operation of the target system. This result has significant real-world impact, as modern smart power grids, aircraft, car, and spacecraft control systems are all cyber-physical systems that rely on trustworthy sensor data to function safely and reliably. A machine-learning-derived false data injection attack against any of these systems could lead to an undetected and potentially catastrophic failure.
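
    A minimal sketch of the monitored estimator such an attack must evade, assuming a scalar system: a Kalman filter whose normalized innovation squared is compared against a chi-squared threshold, with a small injected bias standing in for the RL-generated attack data (the RL training itself is not reproduced here). All system parameters are illustrative.

    ```python
    import numpy as np

    A, C, Q, R = 1.0, 1.0, 0.01, 0.1      # scalar system (illustrative values)
    THRESHOLD = 3.84                       # chi-squared 95% bound, 1 degree of freedom

    def kf_step(x_hat, P, y):
        x_pred = A * x_hat                 # predict
        P_pred = A * P * A + Q
        innov = y - C * x_pred             # innovation feeds the fault detector
        S = C * P_pred * C + R
        g = innov ** 2 / S                 # chi-squared test statistic
        K = P_pred * C / S                 # update
        x_hat = x_pred + K * innov
        P = (1.0 - K * C) * P_pred
        return x_hat, P, g

    rng = np.random.default_rng(0)
    x, x_hat, P = 0.0, 0.0, 1.0
    attack_bias = 0.05                     # small bias aims to stay under the threshold
    for t in range(200):
        x = A * x + rng.normal(0.0, np.sqrt(Q))
        y = C * x + rng.normal(0.0, np.sqrt(R)) + attack_bias   # false data injection
        x_hat, P, g = kf_step(x_hat, P, y)
        detected = g >= THRESHOLD          # the monitor flags this sample if True
    ```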

    Flexible operation and maintenance optimization of aging cyber-physical energy systems by deep reinforcement learning

    Get PDF
    Cyber-Physical Energy Systems (CPESs) integrate cyber and hardware components to ensure reliable and safe physical power production and supply. Renewable Energy Sources (RESs) add uncertainty to the energy demand, which can be dealt with through flexible operation (e.g., load-following) of a CPES; at the same time, scenarios that could result in severe consequences due to both stochastic component failures and aging of the cyber system of the CPES (commonly overlooked) must be accounted for in Operation & Maintenance (O&M) planning. In this paper, we make use of Deep Reinforcement Learning (DRL) to search for the optimal O&M strategy that considers not only the actual health conditions of the system's hardware components and their Remaining Useful Life (RUL), but also the possible accident scenarios caused by the failures and the aging of the hardware and the cyber components, respectively. The novelty of the work lies in embedding the cyber aging model into the CPES model of the production planning and failure process; this model is used to help the RL agent, trained with Proximal Policy Optimization (PPO) and Imitation Learning (IL), find the proper rejuvenation timing for the cyber system, accounting for the uncertainty of the cyber-system aging process. An application is provided with regard to the Advanced Lead-cooled Fast Reactor European Demonstrator (ALFRED).
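
    The decision problem described above can be pictured as a small environment whose state couples hardware health (RUL) with cyber-system age, and whose actions include continuing operation, maintaining hardware, or rejuvenating the cyber layer; a PPO/IL agent would then be trained on it. The dynamics, costs, and failure model below are placeholders, not the ALFRED model.

    ```python
    import numpy as np

    class CyberPhysicalOMEnv:
        CONTINUE, MAINTAIN_HW, REJUVENATE_CYBER = 0, 1, 2

        def __init__(self, rng=None):
            self.rng = rng or np.random.default_rng(0)
            self.reset()

        def reset(self):
            self.rul = 100.0          # hardware remaining useful life (arbitrary units)
            self.cyber_age = 0.0      # cyber-layer age since last rejuvenation
            return np.array([self.rul, self.cyber_age])

        def step(self, action):
            reward = 1.0                                  # revenue from production
            if action == self.MAINTAIN_HW:
                self.rul, reward = 100.0, reward - 5.0    # maintenance cost, RUL restored
            if action == self.REJUVENATE_CYBER:
                self.cyber_age, reward = 0.0, reward - 2.0
            self.rul -= self.rng.exponential(1.0)         # stochastic degradation
            self.cyber_age += 1.0
            # failure risk grows with hardware wear-out and cyber aging
            p_fail = min(1.0, 0.001 * self.cyber_age + 0.01 * max(0.0, 20.0 - self.rul))
            done = self.rng.random() < p_fail
            if done:
                reward -= 100.0                           # severe-accident penalty
            return np.array([self.rul, self.cyber_age]), reward, done, {}
    ```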