    Automating Vehicles by Deep Reinforcement Learning using Task Separation with Hill Climbing

    Within the context of autonomous driving, a model-based reinforcement learning algorithm is proposed for the design of neural network-parameterized controllers. Classical model-based control methods, which include sampling- and lattice-based algorithms and model predictive control, suffer from a trade-off between model complexity and the computational burden of solving expensive optimization or search problems online at every short sampling time. To circumvent this trade-off, a two-step procedure is motivated: first, a controller is learned during offline training based on an arbitrarily complicated mathematical system model; then, the trained controller is evaluated online by fast feedforward evaluation. The contribution of this paper is a simple gradient-free and model-based algorithm for deep reinforcement learning using task separation with hill climbing (TSHC). In particular, the paper advocates (i) simultaneous training on separate deterministic tasks with the purpose of encoding many motion primitives in a neural network, and (ii) the employment of maximally sparse rewards in combination with virtual velocity constraints (VVCs) in setpoint proximity. Comment: 10 pages, 6 figures, 1 table
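
    The gradient-free idea in the abstract above can be illustrated with a minimal sketch of generic hill climbing over controller parameters, evaluated jointly on several deterministic tasks. This is not the authors' TSHC algorithm: the toy 1-D double integrator, the two-parameter linear controller (in place of a neural network), the names `rollout` and `hill_climb`, and all numeric settings are assumptions chosen for illustration. The velocity check near the setpoint is only loosely inspired by the VVC idea.

```python
import numpy as np

def rollout(theta, x0, steps=200, tol=0.05, dt=0.1):
    """Simulate a toy 1-D double integrator under saturated linear feedback.

    Returns a sparse cost: the number of steps needed to bring both position
    and velocity within `tol` of the setpoint (the origin), or `steps` if
    that never happens. All dynamics here are hypothetical.
    """
    pos, vel = x0, 0.0
    for t in range(steps):
        if abs(pos) < tol and abs(vel) < tol:
            return t
        u = float(np.clip(-theta[0] * pos - theta[1] * vel, -1.0, 1.0))
        vel += dt * u
        pos += dt * vel
    return steps

def hill_climb(tasks, iters=200, sigma=1.0, seed=0):
    """Random-perturbation hill climbing: keep a candidate only if the
    summed cost over all tasks strictly improves."""
    rng = np.random.default_rng(seed)
    theta = np.zeros(2)
    best = sum(rollout(theta, x0) for x0 in tasks)
    for _ in range(iters):
        cand = theta + sigma * rng.standard_normal(2)
        cost = sum(rollout(cand, x0) for x0 in tasks)
        if cost < best:
            theta, best = cand, cost
    return theta, best

# Each task is a different deterministic start position (assumed values).
tasks = [-1.0, -0.5, 0.5, 1.0]
initial = sum(rollout(np.zeros(2), x0) for x0 in tasks)
theta, best = hill_climb(tasks)
print("initial cost:", initial, "best cost:", best, "theta:", theta)
```

    Because a candidate is accepted only when the summed cost decreases, the final cost can never exceed the cost of the initial all-zero controller.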

    Learning and Management for Internet-of-Things: Accounting for Adaptivity and Scalability

    Internet-of-Things (IoT) envisions an intelligent infrastructure of networked smart devices offering task-specific monitoring and control services. The unique features of IoT include extreme heterogeneity, a massive number of devices, and unpredictable dynamics partially due to human interaction. These call for foundational innovations in network design and management. Ideally, such a design should allow efficient adaptation to changing environments and low-cost implementation scalable to a massive number of devices, subject to stringent latency constraints. To this end, the overarching goal of this paper is to outline a unified framework for online learning and management policies in IoT through joint advances in communication, networking, learning, and optimization. From the network architecture vantage point, the unified framework leverages a promising fog architecture that enables smart devices to have proximity access to cloud functionalities at the network edge, along the cloud-to-things continuum. From the algorithmic perspective, key innovations target online approaches adaptive to different degrees of nonstationarity in IoT dynamics, and their scalable model-free implementation under limited feedback that motivates blind or bandit approaches. The proposed framework aspires to offer a stepping stone that leads to systematic designs and analysis of task-specific learning and management schemes for IoT, along with a host of new research directions to build on. Comment: Submitted on June 15 to Proceeding of IEEE Special Issue on Adaptive and Scalable Communication Network

    Convergence Analysis of Mixed Timescale Cross-Layer Stochastic Optimization

    This paper considers a cross-layer optimization problem driven by multi-timescale stochastic exogenous processes in wireless communication networks. Due to the hierarchical information structure in a wireless network, a mixed timescale stochastic iterative algorithm is proposed to track the time-varying optimal solution of the cross-layer optimization problem, where the variables are partitioned into short-term controls updated on a faster timescale and long-term controls updated on a slower timescale. We focus on establishing a convergence analysis framework for such multi-timescale algorithms, which is difficult due to the timescale separation of the algorithm and the time-varying nature of the exogenous processes. To cope with this challenge, we model the algorithm dynamics using stochastic differential equations (SDEs) and show that the study of the algorithm convergence is equivalent to the study of the stochastic stability of a virtual stochastic dynamic system (VSDS). Leveraging the techniques of Lyapunov stability, we derive a sufficient condition for the algorithm stability and a tracking error bound in terms of the parameters of the multi-timescale exogenous processes. Based on these results, an adaptive compensation algorithm is proposed to enhance the tracking performance. Finally, we illustrate the framework with an application example in wireless heterogeneous network
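
    The split into fast short-term and slow long-term updates described above can be illustrated with a minimal sketch of a generic two-timescale stochastic iteration. It is not the paper's algorithm or its SDE-based analysis: the scalar toy problem, the map `h(y) = 2*y`, the target value, the step sizes, and the name `two_timescale` are all assumptions made for illustration.

```python
import numpy as np

def two_timescale(iters=5000, fast_step=0.1, slow_step=0.005,
                  noise_std=0.1, target=1.0, seed=0):
    """Toy two-timescale stochastic approximation (hypothetical problem).

    Fast variable x tracks a noisy observation of h(y) = 2*y.
    Slow variable y is nudged so that x approaches `target`.
    The fixed point is y = target / 2 and x = target.
    """
    rng = np.random.default_rng(seed)
    x, y = 0.0, 0.0
    for _ in range(iters):
        observation = 2.0 * y + noise_std * rng.standard_normal()
        # Short-term control: large step, reacts quickly to the noisy signal.
        x += fast_step * (observation - x)
        # Long-term control: small step, sees x as nearly converged.
        y += slow_step * (target - x)
    return x, y

x, y = two_timescale()
print("fast variable x:", x, "slow variable y:", y)
```

    With the slow step much smaller than the fast step, the slow update effectively sees the fast variable at its quasi-equilibrium, which is the intuition behind timescale separation.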

    Short-term Self-Scheduling of Virtual Energy Hub Plant within Thermal Energy Market

    Multicarrier energy systems create new challenges as well as opportunities in future energy systems. One of these challenges is the interaction among multiple energy systems and energy hubs in different energy markets. With the advent of local thermal energy markets in many countries, the scheduling of energy hubs becomes more prominent. In this article, a new approach to energy hub scheduling is offered, called the virtual energy hub (VEH). The VEH denotes an architecture based on the energy hub concept, combined with the proposed self-scheduling approach. The VEH operates on different energy carriers and facilities and maximizes its revenue by participating in both the local electrical and thermal energy markets. A player that participates in the energy markets from an integrated point of view can achieve a higher benefit and better operation of its facilities than independent energy systems can. In a competitive energy market, a VEH solves its self-scheduling problem in order to maximize its benefit while considering uncertainties related to renewable resources. To handle the problem under uncertainty, a nonprobabilistic information gap method is implemented in this study. The proposed model enables the VEH to pursue two different strategies concerning uncertainties, namely a risk-averse strategy and a risk-seeking strategy. For effective participation of the renewable-based VEH plant in the local energy market, a compressed air energy storage unit is used to mitigate the volatility of wind power generation. Finally, the proposed model is applied to a test case, and the numerical results validate the proposed approach.