Automating Vehicles by Deep Reinforcement Learning using Task Separation with Hill Climbing
Within the context of autonomous driving, a model-based reinforcement learning
algorithm is proposed for the design of neural network-parameterized
controllers. Classical model-based control methods, which include sampling- and
lattice-based algorithms and model predictive control, suffer from the
trade-off between model complexity and computational burden required for the
online solution of expensive optimization or search problems at every short
sampling time. To circumvent this trade-off, a two-step procedure is motivated:
first, a controller is learned during offline training based on an arbitrarily
complicated mathematical system model; second, the trained controller is
evaluated online by a fast feedforward pass. The contribution of this paper is the
proposition of a simple gradient-free and model-based algorithm for deep
reinforcement learning using task separation with hill climbing (TSHC). In
particular, (i) simultaneous training on separate deterministic tasks with the
purpose of encoding many motion primitives in a neural network, and (ii) the
employment of maximally sparse rewards in combination with virtual velocity
constraints (VVCs) in setpoint proximity are advocated. Comment: 10 pages, 6 figures, 1 tabl
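The abstract above describes gradient-free training on several deterministic tasks with sparse rewards. A minimal sketch of that general idea follows; it is not the paper's TSHC algorithm. The one-dimensional plant, the single-gain "controller", the task list, and all names (`rollout`, `total_return`, `hill_climb`) are hypothetical choices made for illustration, and virtual velocity constraints are not modelled.

```python
import random

def rollout(gain, start, goal, steps=50, dt=0.1, tol=0.05):
    """Simulate the toy plant x' = u with u = gain * (goal - x).

    Assumed sparse reward: -1 for every step taken before the state is
    within `tol` of the goal, so the return is minus the time to goal.
    """
    x = start
    for t in range(steps):
        if abs(goal - x) < tol:
            return -t
        u = max(-1.0, min(1.0, gain * (goal - x)))  # saturate the input
        x += dt * u
    return -steps  # goal never reached within the horizon

def total_return(gain, tasks):
    """Sum the returns over all (start, goal) tasks for one parameter."""
    return sum(rollout(gain, start, goal) for start, goal in tasks)

def hill_climb(tasks, iterations=200, sigma=0.5, seed=0):
    """Gradient-free search: keep a random perturbation only if it
    strictly improves the summed return over all tasks."""
    rng = random.Random(seed)
    best_gain = 0.0
    best_score = total_return(best_gain, tasks)
    for _ in range(iterations):
        candidate = best_gain + rng.gauss(0.0, sigma)
        score = total_return(candidate, tasks)
        if score > best_score:
            best_gain, best_score = candidate, score
    return best_gain, best_score

# Hypothetical tasks: (initial state, setpoint) pairs.
tasks = [(0.0, 1.0), (0.0, -1.0), (2.0, 0.5)]
gain, score = hill_climb(tasks)
```

Evaluating every candidate on all tasks at once means a parameter is accepted only if it helps across the whole task set, which is the sense in which one controller encodes several motions in this toy setting.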
Learning and Management for Internet-of-Things: Accounting for Adaptivity and Scalability
Internet-of-Things (IoT) envisions an intelligent infrastructure of networked
smart devices offering task-specific monitoring and control services. The
unique features of IoT include extreme heterogeneity, a massive number of
devices, and unpredictable dynamics partially due to human interaction. These
call for foundational innovations in network design and management. Ideally, such
designs should allow efficient adaptation to changing environments and low-cost
implementation scalable to a massive number of devices, subject to stringent
latency constraints. To this end, the overarching goal of this paper is to
outline a unified framework for online learning and management policies in IoT
through joint advances in communication, networking, learning, and
optimization. From the network architecture vantage point, the unified
framework leverages a promising fog architecture that enables smart devices to
have proximity access to cloud functionalities at the network edge, along the
cloud-to-things continuum. From the algorithmic perspective, key innovations
target online approaches adaptive to different degrees of nonstationarity in
IoT dynamics, and their scalable model-free implementation under limited
feedback that motivates blind or bandit approaches. The proposed framework
aspires to offer a stepping stone that leads to systematic designs and analysis
of task-specific learning and management schemes for IoT, along with a host of
new research directions to build on. Comment: Submitted on June 15 to Proceeding of IEEE Special Issue on Adaptive
and Scalable Communication Network
Convergence Analysis of Mixed Timescale Cross-Layer Stochastic Optimization
This paper considers a cross-layer optimization problem driven by
multi-timescale stochastic exogenous processes in wireless communication
networks. Due to the hierarchical information structure in a wireless network,
a mixed timescale stochastic iterative algorithm is proposed to track the
time-varying optimal solution of the cross-layer optimization problem, where
the variables are partitioned into short-term controls updated on a faster
timescale, and long-term controls updated on a slower timescale. We focus on
establishing a convergence analysis framework for such multi-timescale
algorithms, which is difficult due to the timescale separation of the algorithm
and the time-varying nature of the exogenous processes. To cope with this
challenge, we model the algorithm dynamics using stochastic differential
equations (SDEs) and show that the study of the algorithm convergence is
equivalent to the study of the stochastic stability of a virtual stochastic
dynamic system (VSDS). Leveraging the techniques of Lyapunov stability, we
derive a sufficient condition for the algorithm stability and a tracking error
bound in terms of the parameters of the multi-timescale exogenous processes.
Based on these results, an adaptive compensation algorithm is proposed to
enhance the tracking performance. Finally, we illustrate the framework by an
application example in wireless heterogeneous network
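The partition into short-term and long-term controls described above can be illustrated with a toy two-timescale stochastic iteration. This is a sketch under stated assumptions, not the paper's algorithm or its SDE analysis: the quadratic objectives, the step sizes, the frame length, and the names (`mixed_timescale`, `fast_step`, `slow_step`, `frame`) are all hypothetical.

```python
import random

def mixed_timescale(frames=200, frame=10, fast_step=0.2, slow_step=0.05,
                    noise=0.1, seed=0):
    """Toy two-timescale stochastic gradient iteration.

    Short-term control x: updated every slot, minimizes (x - 2*y)**2.
    Long-term control y: updated once per frame of `frame` slots,
    minimizes (x - 4)**2 with x standing in for its tracked value 2*y.
    The joint fixed point is x = 4, y = 2.
    """
    rng = random.Random(seed)
    x, y = 0.0, 0.0
    for _ in range(frames):
        for _ in range(frame):
            # Noisy gradient of (x - 2y)^2 with respect to x.
            grad_x = 2.0 * (x - 2.0 * y) + rng.gauss(0.0, noise)
            x -= fast_step * grad_x
        # Noisy gradient of (2y - 4)^2 with respect to y, using x for 2y.
        grad_y = 4.0 * (x - 4.0) + rng.gauss(0.0, noise)
        y -= slow_step * grad_y
    return x, y

x_final, y_final = mixed_timescale()
```

Because the fast variable nearly converges within each frame, the slow update sees an almost settled x. That separation is what makes the coupled iteration behave like two nested problems in this toy case.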
Short-term Self-Scheduling of Virtual Energy Hub Plant within Thermal Energy Market
Multicarrier energy systems create new challenges as well as opportunities in future energy systems. One of these challenges is the interaction among multiple energy systems and energy hubs in different energy markets. With the advent of local thermal energy markets in many countries, the scheduling of energy hubs becomes more prominent. In this article, a new approach to energy hub scheduling is offered, called the virtual energy hub (VEH). The VEH refers to an architecture based on the energy hub concept together with the proposed self-scheduling approach. The VEH operates on different energy carriers and facilities and maximizes its revenue by participating in both the local electrical and thermal energy markets. By taking this integrated point of view, a market participant can achieve a higher benefit and optimal operation of its facilities in comparison with independent energy systems. In a competitive energy market, a VEH solves its self-scheduling problem to maximize its benefit while considering uncertainties related to renewable resources. To handle the problem under uncertainty, a nonprobabilistic information gap method is implemented in this study. The proposed model enables the VEH to pursue two different strategies concerning uncertainties, namely a risk-averse strategy and a risk-seeking strategy. For effective participation of the renewable-based VEH plant in the local energy market, a compressed air energy storage unit is used to mitigate the volatility of wind power generation. Finally, the proposed model is applied to a test case, and the numerical results validate the proposed approach.