DeepSplit: Scalable Verification of Deep Neural Networks via Operator Splitting
Analyzing the worst-case performance of deep neural networks against input
perturbations amounts to solving a large-scale non-convex optimization problem,
for which several past works have proposed convex relaxations as a promising
alternative. However, even for reasonably-sized neural networks, these
relaxations are not tractable, and so must be replaced by even weaker
relaxations in practice. In this work, we propose a novel operator splitting
method that can directly solve a convex relaxation of the problem to high
accuracy, by splitting it into smaller sub-problems that often have analytical
solutions. The method is modular and scales to problem instances that were
previously impossible to solve exactly due to their size. Furthermore, the
solver operations are amenable to fast parallelization with GPU acceleration.
We demonstrate our method by obtaining tighter bounds on the worst-case
performance of large convolutional networks in image classification and
reinforcement learning settings.
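To make the splitting idea concrete, the sketch below shows a generic ADMM-style split of a single layer constraint y = ReLU(Wx + b), with a box-constrained input, into two simple sub-problems coupled by a dual variable. It only illustrates the operator-splitting structure under assumed notation; the paper's actual relaxation, sub-problems, and updates may differ, and the x-update here uses a heuristic box projection rather than an exact constrained minimiser.

```python
# Illustrative ADMM-style splitting for one layer constraint y = relu(W x + b),
# x in [x_lo, x_hi]. Variable names and update rules are assumptions for
# exposition, not the paper's formulation.
import numpy as np

def relu(v):
    return np.maximum(v, 0.0)

def admm_layer_split(W, b, x_lo, x_hi, iters=200):
    """Alternate between two simple sub-problems linked by a scaled dual variable."""
    x = (x_lo + x_hi) / 2.0             # copy of the variable living in the input box
    z = W @ x + b                        # copy of the variable tied to the ReLU output
    u = np.zeros(W.shape[0])             # scaled dual variable for the coupling constraint
    for _ in range(iters):
        # x-update: least-squares fit to the current target, then a heuristic
        # projection onto the input box (elementwise clipping is closed form)
        x = np.clip(np.linalg.lstsq(W, z - b - u, rcond=None)[0], x_lo, x_hi)
        # z-update: a simple closed-form stand-in that re-imposes the ReLU coupling
        z = relu(W @ x + b + u)
        # dual update accumulates the remaining constraint violation
        u = u + (W @ x + b) - z
    return x, z

# Tiny usage example on a random 3x4 layer
rng = np.random.default_rng(0)
W, b = rng.standard_normal((3, 4)), rng.standard_normal(3)
x, z = admm_layer_split(W, b, x_lo=-np.ones(4), x_hi=np.ones(4))
```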
Enhancing the Performance of Multi-Agent Reinforcement Learning for Controlling HVAC Systems
Systems for heating, ventilation and air-conditioning (HVAC) of buildings are
traditionally controlled by rule-based approaches. To reduce the energy
consumption and the environmental impact of HVAC systems, more advanced control
methods such as reinforcement learning are promising. Reinforcement learning
(RL) strategies offer a good alternative, as user feedback and occupant
presence can be integrated more easily. Moreover, multi-agent RL approaches
scale well and can be generalized. In this paper, we first propose a
multi-agent RL framework, based on existing work, that learns to reduce both
energy consumption, by optimizing HVAC control, and the occupants' complaints
about uncomfortable room temperatures. Second, we show how to reduce the
training time required to properly train the RL agents by sharing parameters
between the multiple agents and by applying different pretraining techniques.
Results show that our framework is capable of reducing energy consumption by
around 6% when controlling a complete building, or by 8% for a single room
zone. The occupants' complaints are comparable to, or even fewer than, those
under a rule-based baseline. Additionally, our performance analysis shows that
the training time can be drastically reduced by using parameter sharing.
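As a rough illustration of the parameter-sharing idea, the sketch below lets every zone agent query a single shared policy network, so experience from all zones updates one set of weights. The network sizes, action space, and placeholder loss are assumptions for illustration, not the framework described in the paper.

```python
# Minimal sketch of parameter sharing across zone agents (PyTorch).
# obs_dim, n_actions, n_zones and the toy loss are assumed values, not the paper's.
import torch
import torch.nn as nn

class SharedPolicy(nn.Module):
    def __init__(self, obs_dim, n_actions):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 64), nn.ReLU(),
            nn.Linear(64, n_actions),
        )

    def forward(self, obs):
        return self.net(obs)  # per-zone action logits

obs_dim, n_actions, n_zones = 8, 5, 10
policy = SharedPolicy(obs_dim, n_actions)          # one parameter set for all zones
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)

# Each zone contributes its own observation; gradients from every zone
# flow into the same shared parameters.
zone_obs = torch.randn(n_zones, obs_dim)
target_actions = torch.randint(0, n_actions, (n_zones,))  # placeholder targets
optimizer.zero_grad()
loss = nn.functional.cross_entropy(policy(zone_obs), target_actions)
loss.backward()
optimizer.step()
```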
Scalable Evolutionary Hierarchical Reinforcement Learning
This paper investigates a novel method combining Scalable Evolution Strategies
(S-ES) and Hierarchical Reinforcement Learning (HRL). S-ES, named for its
excellent scalability, was popularised after demonstrating performance
comparable to state-of-the-art policy gradient methods. However, S-ES has not
been tested in conjunction with HRL methods, which enable temporal abstraction
and thus allow agents to tackle more challenging problems. We introduce a novel
method merging S-ES and HRL, which creates a highly scalable algorithm that is
also efficient in terms of compute time. We demonstrate that the proposed
method benefits from S-ES's scalability and indifference to delayed rewards.
This results in our main contribution: significantly higher learning speed and
competitive performance compared to gradient-based HRL methods across a range
of tasks.
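For readers unfamiliar with the S-ES side, the following sketch shows a basic OpenAI-style evolution strategies update with antithetic sampling and rank normalisation, which is what makes the approach embarrassingly parallel and insensitive to reward delay. How the evolved parameters are wired into the hierarchical controller is not shown, and the function names and hyperparameters are assumptions rather than the paper's implementation.

```python
# Basic evolution-strategies update (antithetic sampling + rank normalisation).
# In an HRL setting, evaluate() would roll out the policy over (sub-)goal episodes;
# here it is a stand-in fitness function for illustration.
import numpy as np

def es_update(theta, evaluate, n_pairs=32, sigma=0.1, lr=0.02, rng=None):
    rng = rng or np.random.default_rng()
    eps = rng.standard_normal((n_pairs, theta.size))
    # Antithetic sampling: evaluate +/- perturbations and take the difference
    returns = np.array([evaluate(theta + sigma * e) - evaluate(theta - sigma * e)
                        for e in eps])
    # Rank-normalise returns so the update is robust to reward scale and delay
    ranks = returns.argsort().argsort() / (n_pairs - 1) - 0.5
    grad = (ranks[:, None] * eps).sum(axis=0) / (n_pairs * sigma)
    return theta + lr * grad

# Toy usage: maximise -||theta||^2
theta = np.ones(10)
for _ in range(50):
    theta = es_update(theta, evaluate=lambda t: -np.sum(t ** 2))
```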
Towards Run-time Efficient Hierarchical Reinforcement Learning
This paper investigates a novel method combining Scalable Evolution Strategies
(S-ES) and Hierarchical Reinforcement Learning (HRL). S-ES, named for its
excellent scalability, was popularised after demonstrating performance
comparable to state-of-the-art policy gradient methods. However, S-ES has not
been tested in conjunction with HRL methods, which enable temporal abstraction
and thus allow agents to tackle more challenging problems. We introduce a novel
method merging S-ES and HRL, which creates a highly scalable algorithm that is
also efficient in terms of compute time. We demonstrate that the proposed
method benefits from S-ES's scalability and indifference to delayed rewards.
This results in our main contribution: significantly higher learning speed and
competitive performance compared to gradient-based HRL methods across a range
of tasks.
Autonomous Drone Landings on an Unmanned Marine Vehicle using Deep Reinforcement Learning
This thesis describes the integration of an Unmanned Surface Vehicle (USV) and an Unmanned Aerial Vehicle (UAV, also commonly known as a drone) in a single Multi-Agent System (MAS). In marine robotics, the advantage offered by a MAS consists of exploiting the key features of one robot to compensate for the shortcomings of the other. In this way, a USV can serve as the landing platform to alleviate the need for a UAV to remain airborne for long periods of time, whilst the latter can increase the overall environmental awareness thanks to its ability to cover large portions of the surrounding environment with one or more onboard cameras. There are numerous potential applications in which this system can be used, such as deployment in search and rescue missions, water and coastal monitoring, and reconnaissance and force protection, to name but a few.
The theory developed is of a general nature. The landing manoeuvre has been accomplished mainly by identifying, through artificial vision techniques, a fiducial marker placed on a flat surface serving as a landing platform. The raison d'etre for the thesis was to propose a new solution for autonomous landing that relies solely on onboard sensors and requires minimal or no communication between the vehicles. To this end, initial work solved the problem using only data from the cameras mounted on the in-flight drone. In situations in which the tracking of the marker is interrupted, the current position of the USV is estimated and integrated into the control commands. The limitations of the classic control theory used in this approach suggested the need for a new solution that exploits the flexibility of intelligent methods, such as fuzzy logic or artificial neural networks. The recent achievements of deep reinforcement learning (DRL) techniques in end-to-end control, such as playing the Atari video-game suite, represented a fascinating yet challenging new way to see and address the landing problem. Therefore, novel architectures were designed for approximating the action-value function of a Q-learning algorithm and used to map raw input observations to high-level navigation actions. In this way, the UAV learnt how to land from high altitude without any human supervision, using only low-resolution grey-scale images, with accuracy and robustness. Both approaches have been implemented on a simulated test-bed based on the Gazebo simulator and the model of the Parrot AR-Drone. The solution based on DRL was further verified experimentally using the Parrot Bebop 2 in a series of trials. The outcomes demonstrate that both of these innovative methods are feasible and practicable, not only in an outdoor marine scenario but also in indoor ones.
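As a rough sketch of the kind of action-value network the thesis describes, the code below maps a stack of low-resolution grey-scale frames to Q-values over a small set of discrete navigation actions. The layer sizes, frame resolution, and action set are assumptions for illustration, not the architectures actually designed in the thesis.

```python
# Illustrative DQN-style action-value network for landing from grey-scale frames.
# Shapes and the action set are assumed, not the thesis' exact architecture.
import torch
import torch.nn as nn

class LandingQNet(nn.Module):
    def __init__(self, n_frames=4, n_actions=6):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(n_frames, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
        )
        self.head = nn.Sequential(
            nn.Flatten(),
            nn.LazyLinear(256), nn.ReLU(),   # infers the flattened size at first call
            nn.Linear(256, n_actions),       # one Q-value per navigation action
        )

    def forward(self, frames):
        return self.head(self.conv(frames))

# Greedy action selection from an 84x84 grey-scale frame stack
qnet = LandingQNet()
frames = torch.zeros(1, 4, 84, 84)           # batch of one stacked observation
action = qnet(frames).argmax(dim=1)
```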
Algorithms for Optimal Paths of One, Many, and an Infinite Number of Agents
In this dissertation, we provide efficient algorithms for modeling the behavior of a single agent, multiple agents, and a continuum of agents. For a single agent, we combine the modeling framework of optimal control with advances in optimization splitting in order to efficiently find optimal paths for problems in very high dimensions, thus alleviating the curse of dimensionality. For a multiple, but finite, number of agents, we take the framework of multi-agent reinforcement learning and utilize imitation learning in order to decentralize a centralized expert, thus obtaining optimal agents that act in a decentralized fashion. For a continuum of agents, we take the framework of mean-field games and use two neural networks, which we train in an alternating scheme, in order to efficiently find optimal paths for high-dimensional and stochastic problems. These tools cover a wide variety of use cases that can be immediately deployed for practical applications.
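The two-network alternating scheme mentioned for the mean-field-game setting can be illustrated roughly as below: one network is updated while the other is held fixed, and the roles swap every step. The losses here are deliberately generic placeholders; the dissertation's actual objectives are not reproduced, and the network names phi and rho are assumptions for exposition.

```python
# Skeleton of an alternating training loop for two networks (PyTorch).
# phi stands for a value-like network and rho for a flow/density-like network;
# both losses are placeholders, not the dissertation's objectives.
import torch
import torch.nn as nn

def mlp(in_dim, out_dim):
    return nn.Sequential(nn.Linear(in_dim, 64), nn.Tanh(), nn.Linear(64, out_dim))

phi = mlp(3, 1)   # e.g. value function phi(t, x) for a 2-D state plus time
rho = mlp(3, 2)   # e.g. a velocity/flow field for the agent population
opt_phi = torch.optim.Adam(phi.parameters(), lr=1e-3)
opt_rho = torch.optim.Adam(rho.parameters(), lr=1e-3)

for step in range(100):
    tx = torch.rand(256, 3)                            # sampled (t, x) collocation points
    if step % 2 == 0:
        # phi-step: update the value network while the flow network is frozen
        residual = phi(tx) + 0.5 * rho(tx).detach().pow(2).sum(-1, keepdim=True)
        loss = residual.pow(2).mean()                  # placeholder residual loss
        opt_phi.zero_grad(); loss.backward(); opt_phi.step()
    else:
        # rho-step: update the flow against the current (frozen) value network
        loss = (rho(tx).pow(2).sum(-1, keepdim=True) + phi(tx).detach()).mean()
        opt_rho.zero_grad(); loss.backward(); opt_rho.step()
```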