
    R^3: On-device Real-Time Deep Reinforcement Learning for Autonomous Robotics

    Autonomous robotic systems, such as autonomous vehicles and robotic search and rescue, require efficient on-device training for continuous adaptation of Deep Reinforcement Learning (DRL) models in dynamic environments. This research is fundamentally motivated by the need to understand and address the challenges of on-device real-time DRL, which involves balancing timing and algorithm performance under memory constraints, as exposed through our extensive empirical studies. This intricate balance requires co-optimizing two pivotal parameters of DRL training -- batch size and replay buffer size. Configuring these parameters significantly affects timing and algorithm performance, while both (unfortunately) require substantial memory allocation to achieve near-optimal performance. This paper presents R^3, a holistic solution for managing timing, memory, and algorithm performance in on-device real-time DRL training. R^3 employs (i) a deadline-driven feedback loop with dynamic batch sizing for optimizing timing, (ii) efficient memory management to reduce memory footprint and allow larger replay buffer sizes, and (iii) a runtime coordinator guided by heuristic analysis and a runtime profiler for dynamically adjusting memory resource reservations. These components collaboratively tackle the trade-offs in on-device DRL training, improving timing and algorithm performance while minimizing the risk of out-of-memory (OOM) errors. We implemented and evaluated R^3 extensively across various DRL frameworks and benchmarks on three hardware platforms commonly adopted by autonomous robotic systems. Additionally, we integrated R^3 with a popular realistic autonomous car simulator to demonstrate its real-world applicability. Evaluation results show that R^3 is effective across diverse platforms, ensuring consistent latency performance and timing predictability with minimal overhead.
    Comment: Accepted by RTSS 202
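
    The core of R^3's timing component is a feedback loop that adapts the batch size so that each training update fits its deadline. The following is a minimal Python sketch of that idea; the proportional controller, its gain, and the bounds are hypothetical illustrations, not the authors' implementation.

        import random
        import time

        class BatchSizeController:
            """Proportional feedback on per-update latency (hypothetical gains)."""

            def __init__(self, deadline_s, b_min=32, b_max=1024, gain=0.5):
                self.deadline_s = deadline_s            # per-update timing budget
                self.b_min, self.b_max = b_min, b_max   # memory-safe batch bounds
                self.gain = gain
                self.batch_size = b_min

            def update(self, observed_s):
                # Grow the batch when there is slack, shrink it after an overrun.
                error = (self.deadline_s - observed_s) / self.deadline_s
                proposed = int(self.batch_size * (1.0 + self.gain * error))
                self.batch_size = max(self.b_min, min(self.b_max, proposed))
                return self.batch_size

        controller = BatchSizeController(deadline_s=0.050)
        for step in range(20):
            t0 = time.monotonic()
            time.sleep(random.uniform(0.01, 0.06))  # stand-in for one DRL update
            controller.update(time.monotonic() - t0)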

    Mixed Criticality on Multi-cores Accounting for Resource Stress and Resource Sensitivity

    The most significant trend in real-time systems design in recent years has been the adoption of multi-core processors and the accompanying integration of functionality with different criticality levels onto the same hardware platform. This paper integrates mixed criticality aspects and assurances within a multi-core system model. It bounds cross-core contention and interference by considering the impact on task execution times due to the stress on shared hardware resources caused by co-runners, and each task’s sensitivity to that resource stress. Schedulability analysis is derived for four mixed criticality scheduling schemes based on partitioned fixed priority preemptive scheduling. Each scheme provides robust timing guarantees for high criticality tasks, ensuring that their timing constraints cannot be jeopardized by the behavior or misbehavior of low criticality tasks.
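
    A minimal Python sketch of the robustness idea: a high-criticality task must remain schedulable even under the worst-case resource stress that misbehaving low-criticality co-runners could generate, while low-criticality tasks are only checked against nominal stress. The linear stress-times-sensitivity inflation and the task fields are illustrative assumptions, not the paper's exact model.

        import math

        def rta(task, higher_prio, stress):
            """Fixed-priority response-time analysis with stress-inflated WCETs."""
            def cost(t):
                return t["C"] + t["sensitivity"] * stress
            R = cost(task)
            while R <= task["D"]:
                R_next = cost(task) + sum(math.ceil(R / h["T"]) * cost(h)
                                          for h in higher_prio)
                if R_next == R:
                    return R      # converged within the deadline
                R = R_next
            return None           # deadline exceeded

        def schedulable(task, higher_prio, nominal_stress, worst_stress):
            # HI tasks must tolerate the worst stress co-runners can cause.
            stress = worst_stress if task["crit"] == "HI" else nominal_stress
            return rta(task, higher_prio, stress) is not None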

    Schedulability Analysis for Multi-Core Systems Accounting for Resource Stress and Sensitivity

    Timing verification of multi-core systems is complicated by contention for shared hardware resources between co-running tasks on different cores. This paper introduces the Multi-core Resource Stress and Sensitivity (MRSS) task model that characterizes how much stress each task places on resources and how much it is sensitive to such resource stress. This model facilitates a separation of concerns, thus retaining the advantages of the traditional two-step approach to timing verification (i.e. timing analysis followed by schedulability analysis). Response time analysis is derived for the MRSS task model, providing efficient context-dependent and context-independent schedulability tests for both fixed priority preemptive and fixed priority non-preemptive scheduling. Dominance relations are derived between the tests, and proofs of optimal priority assignment are provided. The MRSS task model is underpinned by a proof-of-concept industrial case study.
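
    The distinction between the two kinds of test can be made concrete with a small Python sketch: a context-dependent test inflates a task's WCET by its sensitivity to the stress of the co-runners actually mapped to the other cores, while a context-independent test assumes maximal stress from every other core. The linear inflation model and field names are assumptions for illustration.

        def inflated_wcet(task, co_runners=None, max_stress=0.0, other_cores=0):
            """MRSS-style WCET inflation (linear model assumed for illustration)."""
            if co_runners is not None:
                # Context-dependent: stress from the tasks actually co-located
                # on the other cores during this task's execution window.
                stress = sum(t["stress"] for t in co_runners)
            else:
                # Context-independent: pessimistically assume the maximum
                # stress any co-runner could generate on every other core.
                stress = other_cores * max_stress
            return task["C"] + task["sensitivity"] * stress

    Since the context-dependent stress can never exceed the context-independent bound, a test using the former dominates one using the latter, mirroring the dominance relations derived in the paper.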

    Software Fault Tolerance in Real-Time Systems: Identifying the Future Research Questions

    Tolerating hardware faults in modern architectures is becoming a prominent problem due to the miniaturization of hardware components, their increasing complexity, and the need to reduce costs. Software-Implemented Hardware Fault Tolerance approaches have been developed to improve system dependability against hardware faults without resorting to custom hardware solutions. However, they make it harder, from a scheduling standpoint, to satisfy the timing constraints of the applications. This paper surveys the current state of the art of fault tolerance approaches used in the context of real-time systems, identifying the main challenges and the cross-links between these two topics. We propose a joint scheduling-failure analysis model that highlights the formal interactions between software fault tolerance mechanisms and timing properties. This model allows us to present and discuss many open research questions, with the final aim of spurring future research.
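
    One classic example of the interaction such a joint model must capture is fixed-priority response-time analysis extended with task re-execution as the recovery mechanism. Below is a minimal Python sketch of that textbook-style analysis, assuming faults arrive at least T_F apart and each fault forces re-execution of the longest affected task; it is illustrative, not the survey's proposed model.

        import math

        def response_time_with_faults(task, higher_prio, T_F):
            """RTA with re-execution: each fault re-runs the longest task."""
            recovery = max(t["C"] for t in [task] + higher_prio)
            R = task["C"]
            while R <= task["D"]:
                R_next = (task["C"]
                          + sum(math.ceil(R / h["T"]) * h["C"] for h in higher_prio)
                          + math.ceil(R / T_F) * recovery)  # fault re-executions
                if R_next == R:
                    return R
                R = R_next
            return None  # deadline exceeded under the assumed fault rate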

    ATMP-CA: Optimising Mixed-Criticality Systems Considering Criticality Arithmetic

    Many safety-critical systems use criticality arithmetic, an informal practice of implementing a higher-criticality function by combining several lower-criticality redundant components or tasks. This lowers the cost of development, but existing mixed-criticality schedulers may act incorrectly as they lack the knowledge that the lower-criticality tasks are operating together to implement a single higher-criticality function. In this paper, we propose a solution to this problem by presenting a mixed-criticality mid-term scheduler that considers where criticality arithmetic is used in the system. As this scheduler, which we term ATMP-CA, is a mid-term scheduler, it changes the configuration of the system when needed based on the recent history of deadline misses. We present the results from a series of experiments showing that ATMP-CA provides a smoother degradation of service compared with reference schedulers that do not consider the use of criticality arithmetic.
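
    A minimal Python sketch of the criticality-arithmetic constraint, with hypothetical task fields and thresholds: redundant lower-criticality tasks that together implement one higher-criticality function must not all be degraded, even though each looks expendable in isolation.

        from collections import defaultdict

        def choose_tasks_to_degrade(tasks, miss_ratio, threshold=0.05,
                                    min_replicas=1):
            """Pick degradation victims without breaking any redundant group."""
            if miss_ratio <= threshold:
                return []                       # recent history is acceptable
            by_function = defaultdict(list)
            for t in tasks:
                by_function[t["function"]].append(t)
            victims = []
            for t in sorted(tasks, key=lambda t: t["criticality"]):
                alive = [r for r in by_function[t["function"]] if r not in victims]
                # Keep enough replicas running to preserve the function the
                # group implements (a real scheduler would also stop once
                # enough capacity had been reclaimed).
                if len(alive) > min_replicas:
                    victims.append(t)
            return victims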

    Scheduling Classifiers for Real-Time Hazard Perception Considering Functional Uncertainty

    This paper addresses the problem of real-time classification-based machine perception, exemplified by a mobile autonomous system that must continually check that a designated area ahead is free of hazards. Such hazards must be identified within a specified time. In practice, classifiers are imperfect; they exhibit functional uncertainty. In the majority of cases, a given classifier will correctly determine whether there is a hazard or the area ahead is clear. However, in other cases it may produce false positives, i.e. indicate a hazard when the area is clear, or false negatives, i.e. indicate clear when there is in fact a hazard. The former are undesirable since they reduce quality of service, whereas the latter are a potential safety concern. A stringent constraint is therefore placed on the maximum permitted probability of false negatives. Since this requirement may not be achievable using a single classifier, one approach is to (logically) OR the outputs of multiple disparate classifiers together, setting the final output to hazard if any of the classifiers indicates hazard. This reduces the probability of false negatives; however, the trade-off is an inevitable increase in the probability of false positives and in the overall execution time required. In this paper, we provide optimal algorithms for the scheduling of classifiers that minimize the probability of false positives, while meeting both a latency constraint and a constraint on the maximum acceptable probability of false negatives. The classifiers may have arbitrary statistical dependences between their functional behaviors (probabilities of correct identification of hazards), as well as variability in their execution times, characterized by typical and worst-case values.
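
    The structure of the optimization can be illustrated in Python under the simplifying assumption that the classifiers are statistically independent (the paper itself allows arbitrary dependences and derives optimal rather than brute-force algorithms). OR-ing classifiers multiplies false-negative probabilities together while compounding false positives and execution time; the field names are hypothetical.

        from itertools import chain, combinations

        def or_stats(subset):
            """False-negative/false-positive rates and latency of an OR-chain."""
            p_fn, p_no_fp, latency = 1.0, 1.0, 0.0
            for c in subset:
                p_fn *= c["p_fn"]            # hazard missed only if all miss
                p_no_fp *= 1.0 - c["p_fp"]   # clear scene: any firing is a FP
                latency += c["wcet"]         # worst case: all classifiers run
            return p_fn, 1.0 - p_no_fp, latency

        def best_subset(classifiers, max_fn, max_latency):
            """Brute force: minimise false positives under both constraints."""
            best, best_fp = None, 2.0
            candidates = chain.from_iterable(
                combinations(classifiers, k)
                for k in range(1, len(classifiers) + 1))
            for subset in candidates:
                p_fn, p_fp, latency = or_stats(subset)
                if p_fn <= max_fn and latency <= max_latency and p_fp < best_fp:
                    best, best_fp = subset, p_fp
            return best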

    A Framework for Multi-core Schedulability Analysis Accounting for Resource Stress and Sensitivity

    Timing verification of multi-core systems is complicated by contention for shared hardware resources between co-running tasks on different cores. This paper introduces the Multi-core Resource Stress and Sensitivity (MRSS) task model that characterizes how much stress each task places on resources and how much it is sensitive to such resource stress. This model facilitates a separation of concerns, thus retaining the advantages of the traditional two-step approach to timing verification (i.e. timing analysis followed by schedulability analysis). Response time analysis is derived for the MRSS task model, providing efficient context-dependent and context-independent schedulability tests for both fixed priority preemptive and fixed priority non-preemptive scheduling. Dominance relations are derived between the tests, along with complexity results and proofs of optimal priority assignment policies. The MRSS task model is underpinned by a proof-of-concept industrial case study. The problem of task allocation is considered in the context of the MRSS task model, with Simulated Annealing shown to provide an effective solution.
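
    The allocation step lends itself to a compact sketch. Below is a minimal simulated-annealing loop in Python that reassigns one task per move and scores an allocation with any caller-supplied cost function, for instance the number of tasks failing a stress- and sensitivity-aware schedulability test; the cooling schedule and parameters are illustrative.

        import math
        import random

        def anneal(tasks, num_cores, cost, T0=10.0, cooling=0.95, iters=5000):
            """Simulated annealing over task-to-core allocations."""
            alloc = {t["name"]: random.randrange(num_cores) for t in tasks}
            best, best_cost = dict(alloc), cost(alloc)
            temp = T0
            for _ in range(iters):
                cand = dict(alloc)
                cand[random.choice(tasks)["name"]] = random.randrange(num_cores)
                delta = cost(cand) - cost(alloc)
                # Accept improvements always; accept regressions with a
                # probability that shrinks as the temperature cools.
                if delta <= 0 or random.random() < math.exp(-delta / temp):
                    alloc = cand
                    if cost(alloc) < best_cost:
                        best, best_cost = dict(alloc), cost(alloc)
                temp *= cooling
            return best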

    Improving Performance of Feedback-Based Real-Time Networks using Model Checking and Reinforcement Learning

    Traditionally, automatic control techniques arose from the need for automation in mechanical systems. These techniques rely on robust mathematical modelling of physical systems, with the goal of driving their behaviour to desired set-points. Decades of research have successfully automated, optimized, and ensured the safety of a wide variety of mechanical systems. Recent advances in digital technology have made computers pervasive in every facet of life, and there have been many recent attempts to incorporate control techniques into digital technology. This thesis investigates the intersection and co-application of control theory and computer science to evaluate and improve the performance of time-critical systems. It applies two different research areas, namely model checking and reinforcement learning, to design and evaluate two real-time networks in conjunction with control technologies. The first is a camera surveillance system with the goal of constrained resource allocation to self-adaptive cameras. The second is a dual-delay real-time communication network with the goal of safe packet routing with minimal delays.

    The camera surveillance system consists of self-adaptive cameras and a centralized manager: the cameras capture a stream of images and transmit them to the manager over a shared, constrained communication channel, and the event-based manager allocates fractions of the shared bandwidth to all cameras in the network. The thesis provides guarantees on the behaviour of the camera surveillance network through model checking. Disturbances that arise during image capture due to variations in capture scenes are modelled using probabilistic and non-deterministic Markov Decision Processes (MDPs). Properties of the camera network, such as the number of frame drops and bandwidth reallocations, are evaluated through formal verification.

    The second part of the thesis explores packet routing for real-time networks constructed from nodes and directed edges. Each edge in the network carries two different delays: a worst-case delay that captures high-load characteristics, and a typical delay that captures the current network load. Each node takes safe routing decisions by considering the delays already encountered and the amount of time remaining. The thesis applies reinforcement learning to route packets through the network with minimal delays while ensuring that the total path delay from source to destination does not exceed the pre-determined deadline of the packet. The reinforcement learning algorithm explores new edges to find optimal routing paths while ensuring safety through a simple pre-processing algorithm. The thesis shows that it is possible to apply powerful reinforcement learning techniques to time-critical systems given expert knowledge about the system.
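
    The safety mechanism described in the final paragraph can be sketched in a few lines of Python. A pre-processing pass computes, for every node, the worst-case delay of the best path to the destination; at run time the learner may only explore an edge if the delay already spent, plus that edge's worst-case delay, plus the worst-case remainder from its endpoint, still fits the packet's deadline. The graph representation and names are illustrative.

        import heapq

        def worst_case_to_dest(reverse_edges, dest):
            """Dijkstra from dest over worst-case delays on the reversed graph."""
            dist = {dest: 0.0}
            pq = [(0.0, dest)]
            while pq:
                d, v = heapq.heappop(pq)
                if d > dist[v]:
                    continue                      # stale queue entry
                for u, (_typ, worst) in reverse_edges.get(v, {}).items():
                    if d + worst < dist.get(u, float("inf")):
                        dist[u] = d + worst
                        heapq.heappush(pq, (dist[u], u))
            return dist

        def safe_next_hops(edges, wc_rest, node, spent, deadline):
            # An edge may be explored only if taking it can never cause the
            # packet to miss its deadline, however the rest of the path goes.
            return [v for v, (_typ, worst) in edges[node].items()
                    if spent + worst + wc_rest.get(v, float("inf")) <= deadline]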

    Optimal Scheduling to Manage an Electric Bus Fleet Overnight Charging

    Electro-mobility is increasing significantly in urban public transport and continues to face important challenges. Electric bus fleets require high performance and extended longevity from lithium-ion batteries at highly variable temperatures and under different operating conditions. At the same time, bus operators are concerned with reducing operation and maintenance costs, of which battery aging cost is a significant component in the deployment of electric bus fleets. This paper introduces a methodological approach to managing the overnight charging of an electric bus fleet. The approach identifies an optimal charging strategy that minimizes the battery aging cost (the cost of replacing the battery, spread over the battery lifetime). The optimization constraints relate to the bus operating conditions, the electric vehicle supply equipment, and the power grid. The optimization evaluates the fitness function through coupled modeling of the electro-thermal and aging properties of lithium-ion batteries. Simulation results indicate a significant reduction in battery capacity loss over 10 years of operation for the optimal charging strategy compared to three typical charging strategies.
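
    As a flavour of the scheduling problem, the following Python sketch builds a just-in-time, power-capped overnight plan: charging finishes as close to departure as possible, limiting the time batteries spend at high state of charge, one of the main levers on aging. The greedy policy and all names are placeholders for the paper's optimization against its coupled electro-thermal/aging model.

        def overnight_schedule(buses, slots, max_site_power_kw):
            """Greedy just-in-time allocation of charging power per time slot."""
            plan = {b["id"]: [0.0] * len(slots) for b in buses}
            for b in buses:
                need_kwh = b["energy_kwh"]
                # Walk backwards from departure so charging ends just in time.
                for s in range(b["departure_slot"] - 1, -1, -1):
                    if need_kwh <= 0:
                        break
                    used = sum(plan[x["id"]][s] for x in buses)
                    p = min(b["max_charge_kw"],
                            max_site_power_kw - used,     # respect the grid cap
                            need_kwh / slots[s]["hours"])
                    plan[b["id"]][s] = max(0.0, p)
                    need_kwh -= plan[b["id"]][s] * slots[s]["hours"]
            return plan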