Scheduling Periodic Real-Time Tasks with Heterogeneous Reward Requirements
Abstract: We study the problem of scheduling periodic real-time tasks which have individual minimum reward requirements. We consider situations where tasks generate jobs that can be provided arbitrary service times before their deadlines, and obtain rewards based on the service times received by the jobs of the task. We show that this model is compatible with the imprecise computation models and the increasing reward with increasing service models. In contrast to previous work on these models, which mainly focuses on maximizing the total reward in the system, we additionally aim to fulfill different reward requirements of different tasks. This provides better fairness and also allows fine-grained tradeoffs between tasks. We first derive a necessary and sufficient condition for a system with reward requirements of tasks to be feasible. We next obtain an off-line feasibility optimal scheduling policy. We then study a sufficient condition for a policy to be feasibility optimal or to achieve some approximation bound. This condition serves as a guideline for designing an on-line scheduling policy, and we obtain a greedy policy based on it. We prove that the on-line policy is feasibility optimal when all tasks have the same period, and also obtain an approximation bound for the policy in the general case. We test our policies through comparative simulations.
Optimal Time Utility Based Scheduling Policy Design for Cyber-Physical Systems
Classical scheduling abstractions such as deadlines and priorities do not readily capture the complex timing semantics found in many real-time cyber-physical systems. Time utility functions provide a necessarily richer description of timing semantics, but designing utility-aware scheduling policies using them is an open research problem. In particular, optimal utility accrual scheduling design is needed for real-time cyber-physical domains. In this paper we design optimal utility accrual scheduling policies for cyber-physical systems with periodic, non-preemptable tasks that run with stochastic duration. These policies are derived by solving a Markov Decision Process formulation of the scheduling problem. We use this formulation to demonstrate that our technique improves on existing heuristic utility accrual scheduling policies.
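The abstract above derives policies by solving a Markov Decision Process but does not say which solution method is used. As an illustration only, the following is a minimal sketch of value iteration, one standard way to solve a small MDP. The two-state model, its action names, rewards, and discount factor are hypothetical and are not taken from the paper.

```python
def value_iteration(transitions, gamma=0.9, tol=1e-9, max_iter=10_000):
    """Solve a finite MDP by value iteration.

    transitions[state][action] is a list of (probability, next_state, reward).
    Returns the state values and a greedy policy with respect to them.
    """
    def action_value(outcomes, values):
        # Expected one-step reward plus discounted value of the successor.
        return sum(p * (r + gamma * values[s2]) for p, s2, r in outcomes)

    values = {s: 0.0 for s in transitions}
    for _ in range(max_iter):
        delta = 0.0
        for s, actions in transitions.items():
            best = max(action_value(o, values) for o in actions.values())
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:  # stop once a full sweep changes nothing noticeable
            break

    policy = {
        s: max(actions, key=lambda a: action_value(actions[a], values))
        for s, actions in transitions.items()
    }
    return values, policy


# Hypothetical toy model: staying in "B" pays more than staying in "A".
toy_mdp = {
    "A": {"stay": [(1.0, "A", 1.0)], "go": [(1.0, "B", 0.0)]},
    "B": {"stay": [(1.0, "B", 2.0)], "back": [(1.0, "A", 0.0)]},
}
values, policy = value_iteration(toy_mdp)
```

In this toy model the greedy policy moves from "A" to "B" and then stays there, because the discounted stream of rewards in "B" outweighs the immediate reward in "A".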
Deep Reinforcement Learning for Real-Time Optimization in NB-IoT Networks
NarrowBand-Internet of Things (NB-IoT) is an emerging cellular-based
technology that offers a range of flexible configurations for massive IoT radio
access from groups of devices with heterogeneous requirements. A configuration
specifies the amount of radio resource allocated to each group of devices for
random access and for data transmission. Assuming no knowledge of the traffic
statistics, there exists an important challenge in "how to determine the
configuration that maximizes the long-term average number of served IoT devices
at each Transmission Time Interval (TTI) in an online fashion". Given the
complexity of searching for optimal configuration, we first develop real-time
configuration selection based on the tabular Q-learning (tabular-Q), the Linear
Approximation based Q-learning (LA-Q), and the Deep Neural Network based
Q-learning (DQN) in the single-parameter single-group scenario. Our results
show that the proposed reinforcement learning based approaches considerably
outperform the conventional heuristic approaches based on load estimation
(LE-URC) in terms of the number of served IoT devices. This result also
indicates that LA-Q and DQN can be good alternatives for tabular-Q to achieve
almost the same performance with much less training time. We further advance
LA-Q and DQN via Actions Aggregation (AA-LA-Q and AA-DQN) and via Cooperative
Multi-Agent learning (CMA-DQN) for the multi-parameter multi-group scenario,
thereby solving the problem that Q-learning agents do not converge in
high-dimensional configurations. In this scenario, the superiority of the
proposed Q-learning approaches over the conventional LE-URC approach
significantly improves with the increase of configuration dimensions, and the
CMA-DQN approach outperforms the other approaches in both throughput and
training efficiency.
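For readers unfamiliar with the tabular Q-learning baseline named in the abstract above, here is a minimal sketch of the update rule on a hypothetical four-state chain. The environment, reward, and hyperparameters are assumptions chosen for illustration; they do not model NB-IoT configuration selection or reproduce the paper's setup.

```python
import random


def q_learning(n_states=4, episodes=500, alpha=0.5, gamma=0.9,
               epsilon=0.3, seed=0):
    """Tabular Q-learning on a toy chain; the last state is the goal."""
    rng = random.Random(seed)
    goal = n_states - 1
    # q[state][action], with action 0 = move left and action 1 = move right.
    q = [[0.0, 0.0] for _ in range(n_states)]

    def step(state, action):
        nxt = max(0, state - 1) if action == 0 else state + 1
        reward = 1.0 if nxt == goal else 0.0
        return nxt, reward

    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap the episode length
            # Epsilon-greedy choice, with ties broken at random.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1
            nxt, reward = step(state, action)
            # The goal is terminal, so it contributes no bootstrapped value.
            target = reward if nxt == goal else reward + gamma * max(q[nxt])
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
            if state == goal:
                break
    return q


q_table = q_learning()
greedy = [max(range(2), key=lambda a: q_table[s][a]) for s in range(3)]
```

The table has one entry per state-action pair, which is why the abstract turns to linear and neural approximators once the configuration space grows.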
Human-Machine Collaborative Optimization via Apprenticeship Scheduling
Coordinating agents to complete a set of tasks with intercoupled temporal and
resource constraints is computationally challenging, yet human domain experts
can solve these difficult scheduling problems using paradigms learned through
years of apprenticeship. A process for manually codifying this domain knowledge
within a computational framework is necessary to scale beyond the
"single-expert, single-trainee" apprenticeship model. However, human domain
experts often have difficulty describing their decision-making processes,
causing the codification of this knowledge to become laborious. We propose a
new approach for capturing domain-expert heuristics through a pairwise ranking
formulation. Our approach is model-free and does not require enumerating or
iterating through a large state space. We empirically demonstrate that this
approach accurately learns multifaceted heuristics on a synthetic data set
incorporating job-shop scheduling and vehicle routing problems, as well as on
two real-world data sets consisting of demonstrations of experts solving a
weapon-to-target assignment problem and a hospital resource allocation problem.
We also demonstrate that policies learned from human scheduling demonstration
via apprenticeship learning can substantially improve the efficiency of a
branch-and-bound search for an optimal schedule. We employ this human-machine
collaborative optimization technique on a variant of the weapon-to-target
assignment problem. We demonstrate that this technique generates solutions
substantially superior to those produced by human domain experts at a rate up
to 9.5 times faster than an optimization approach and can be applied to
optimally solve problems twice as complex as those solved by a human
demonstrator.
Comment: Portions of this paper were published in the Proceedings of the
International Joint Conference on Artificial Intelligence (IJCAI) in 2016 and
in the Proceedings of Robotics: Science and Systems (RSS) in 2016. The paper
consists of 50 pages with 11 figures and 4 tables.
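The pairwise ranking idea in the abstract above can be illustrated with a small sketch: each demonstration contributes pairs of (chosen task, rejected task), and a scoring function is fit so that chosen tasks score higher. The perceptron update, the two features (deadline, duration), and the demonstration data below are all hypothetical. The abstract does not specify the paper's classifier or features, so this should not be read as the authors' method.

```python
def score(weights, features):
    """Linear score of one candidate task."""
    return sum(w * x for w, x in zip(weights, features))


def train_pairwise_ranker(pairs, epochs=100):
    """Fit weights so each chosen task outscores the task it was chosen over.

    pairs is a list of (chosen_features, rejected_features) tuples.
    Uses a perceptron update on the feature difference of each pair.
    """
    weights = [0.0] * len(pairs[0][0])
    for _ in range(epochs):
        mistakes = 0
        for chosen, rejected in pairs:
            diff = [c - r for c, r in zip(chosen, rejected)]
            if score(weights, diff) <= 0:  # pair is ranked incorrectly
                weights = [w + d for w, d in zip(weights, diff)]
                mistakes += 1
        if mistakes == 0:  # every demonstrated preference is reproduced
            break
    return weights


def pick(weights, candidates):
    """Return the candidate the learned ranker would schedule next."""
    return max(candidates, key=lambda c: score(weights, c))


# Hypothetical demonstrations: the expert always takes the earlier deadline.
demos = [((2, 5), (4, 1)), ((1, 3), (3, 3)), ((3, 4), (6, 2)), ((2, 1), (5, 6))]
weights = train_pairwise_ranker(demos)
choice = pick(weights, [(7, 1), (3, 9), (5, 2)])
```

Because training only compares pairs of candidates, no state space has to be enumerated, which matches the model-free property the abstract claims.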
Markov Decision Processes with Applications in Wireless Sensor Networks: A Survey
Wireless sensor networks (WSNs) consist of autonomous and resource-limited
devices. The devices cooperate to monitor one or more physical phenomena within
an area of interest. WSNs operate as stochastic systems because of randomness
in the monitored environments. For long service time and low maintenance cost,
WSNs require adaptive and robust methods to address data exchange, topology
formulation, resource and power optimization, sensing coverage and object
detection, and security challenges. In these problems, sensor nodes are to make
optimized decisions from a set of accessible strategies to achieve design
goals. This survey reviews numerous applications of the Markov decision process
(MDP) framework, a powerful decision-making tool to develop adaptive algorithms
and protocols for WSNs. Furthermore, various solution methods are discussed and
compared to serve as a guide for using MDPs in WSNs.