8,677 research outputs found

    Scheduling Periodic Real-Time Tasks with Heterogeneous Reward Requirements

    Full text link
    Abstract—We study the problem of scheduling periodic real-time tasks which have individual minimum reward requirements. We consider situations where tasks generate jobs that can be provided arbitrary service times before their deadlines, and obtain rewards based on the service times received by the jobs of the task. We show that this model is compatible with the imprecise computation models and the increasing reward with increasing service models. In contrast to previous work on these models, which mainly focus on maximizing the total reward in the system, we additionally aim to fulfill different reward requirements of different tasks. This provides better fairness and also allows fine-grained tradeoff between tasks. We first derive a necessary and sufficient condition for a system with reward requirements of tasks to be feasible. We next obtain an off-line feasibility optimal scheduling policy. We then study a sufficient condition for a policy to be feasibility optimal or achieve some approximation bound. This condition serves as a guideline for designing on-line scheduling policy and we obtain a greedy policy based on it. We prove that the on-line policy is feasibility optimal when all tasks have the same periods, and also obtain an approximation bound for the policy under general cases. We test our policies through comparative simulations. I

    Optimal Time Utility Based Scheduling Policy Design for Cyber-Physical Systems

    Get PDF
    Classical scheduling abstractions such as deadlines and priorities do not readily capture the complex timing semantics found in many real-time cyber-physical systems. Time utility functions provide a necessarily richer description of timing semantics, but designing utility-aware scheduling policies using them is an open research problem. In particular, optimal utility accrual scheduling design is needed for real-time cyber-physical domains. In this paper we design optimal utility accrual scheduling policies for cyber-physical systems with periodic, non-preemptable tasks that run with stochastic duration. These policies are derived by solving a Markov Decision Process formulation of the scheduling problem. We use this formulation to demonstrate that our technique improves on existing heuristic utility accrual scheduling policies

    Deep Reinforcement Learning for Real-Time Optimization in NB-IoT Networks

    Get PDF
    NarrowBand-Internet of Things (NB-IoT) is an emerging cellular-based technology that offers a range of flexible configurations for massive IoT radio access from groups of devices with heterogeneous requirements. A configuration specifies the amount of radio resource allocated to each group of devices for random access and for data transmission. Assuming no knowledge of the traffic statistics, there exists an important challenge in "how to determine the configuration that maximizes the long-term average number of served IoT devices at each Transmission Time Interval (TTI) in an online fashion". Given the complexity of searching for optimal configuration, we first develop real-time configuration selection based on the tabular Q-learning (tabular-Q), the Linear Approximation based Q-learning (LA-Q), and the Deep Neural Network based Q-learning (DQN) in the single-parameter single-group scenario. Our results show that the proposed reinforcement learning based approaches considerably outperform the conventional heuristic approaches based on load estimation (LE-URC) in terms of the number of served IoT devices. This result also indicates that LA-Q and DQN can be good alternatives for tabular-Q to achieve almost the same performance with much less training time. We further advance LA-Q and DQN via Actions Aggregation (AA-LA-Q and AA-DQN) and via Cooperative Multi-Agent learning (CMA-DQN) for the multi-parameter multi-group scenario, thereby solve the problem that Q-learning agents do not converge in high-dimensional configurations. In this scenario, the superiority of the proposed Q-learning approaches over the conventional LE-URC approach significantly improves with the increase of configuration dimensions, and the CMA-DQN approach outperforms the other approaches in both throughput and training efficiency

    Human-Machine Collaborative Optimization via Apprenticeship Scheduling

    Full text link
    Coordinating agents to complete a set of tasks with intercoupled temporal and resource constraints is computationally challenging, yet human domain experts can solve these difficult scheduling problems using paradigms learned through years of apprenticeship. A process for manually codifying this domain knowledge within a computational framework is necessary to scale beyond the ``single-expert, single-trainee" apprenticeship model. However, human domain experts often have difficulty describing their decision-making processes, causing the codification of this knowledge to become laborious. We propose a new approach for capturing domain-expert heuristics through a pairwise ranking formulation. Our approach is model-free and does not require enumerating or iterating through a large state space. We empirically demonstrate that this approach accurately learns multifaceted heuristics on a synthetic data set incorporating job-shop scheduling and vehicle routing problems, as well as on two real-world data sets consisting of demonstrations of experts solving a weapon-to-target assignment problem and a hospital resource allocation problem. We also demonstrate that policies learned from human scheduling demonstration via apprenticeship learning can substantially improve the efficiency of a branch-and-bound search for an optimal schedule. We employ this human-machine collaborative optimization technique on a variant of the weapon-to-target assignment problem. We demonstrate that this technique generates solutions substantially superior to those produced by human domain experts at a rate up to 9.5 times faster than an optimization approach and can be applied to optimally solve problems twice as complex as those solved by a human demonstrator.Comment: Portions of this paper were published in the Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI) in 2016 and in the Proceedings of Robotics: Science and Systems (RSS) in 2016. The paper consists of 50 pages with 11 figures and 4 table

    Markov Decision Processes with Applications in Wireless Sensor Networks: A Survey

    Full text link
    Wireless sensor networks (WSNs) consist of autonomous and resource-limited devices. The devices cooperate to monitor one or more physical phenomena within an area of interest. WSNs operate as stochastic systems because of randomness in the monitored environments. For long service time and low maintenance cost, WSNs require adaptive and robust methods to address data exchange, topology formulation, resource and power optimization, sensing coverage and object detection, and security challenges. In these problems, sensor nodes are to make optimized decisions from a set of accessible strategies to achieve design goals. This survey reviews numerous applications of the Markov decision process (MDP) framework, a powerful decision-making tool to develop adaptive algorithms and protocols for WSNs. Furthermore, various solution methods are discussed and compared to serve as a guide for using MDPs in WSNs
    • …
    corecore