Wardrop Equilibrium in Discrete-Time Selfish Routing with Time-Varying Bounded Delays
This paper presents a multi-commodity, discrete-time, distributed and non-cooperative routing algorithm, which is proved to converge to an equilibrium in the presence of heterogeneous, unknown, time-varying but bounded delays. Under mild assumptions on the latency functions, which describe the cost associated with the network paths, two algorithms are proposed: the former assumes that each commodity relies only on measurements of the latencies associated with its own paths; the latter assumes that each commodity has (at least indirectly) access to the measurements of the latencies of all the network paths. Both algorithms are proven to drive the system state to an invariant set which approximates and contains the Wardrop equilibrium, defined as a network state in which no traffic flow over the network paths can improve its routing unilaterally, with the latter achieving a better reconstruction of the Wardrop equilibrium. Numerical simulations show the effectiveness of the proposed approach.
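The Wardrop condition defined in the abstract above can be illustrated with a toy example. The sketch below is not the paper's algorithm: it assumes a single commodity, two parallel paths, hypothetical affine latency functions, no measurement delays, and a simple rerouting rule that shifts flow toward the faster path in proportion to the latency gap.

```python
def latencies(flows):
    """Hypothetical affine latency functions for two parallel paths."""
    x1, x2 = flows
    return [1.0 + 2.0 * x1, 2.0 + 1.0 * x2]


def is_wardrop(flows, tol=1e-6):
    """True if every path carrying flow has (approximately) minimal latency."""
    lat = latencies(flows)
    used = [l for f, l in zip(flows, lat) if f > tol]
    return max(used) <= min(lat) + tol


def reroute(flows, step=0.1, iters=500):
    """Discrete-time rerouting: move flow from the slower to the faster path."""
    x1, x2 = flows
    for _ in range(iters):
        l1, l2 = latencies((x1, x2))
        shift = step * (l1 - l2)          # positive: path 1 is slower
        shift = max(-x2, min(x1, shift))  # keep both flows non-negative
        x1, x2 = x1 - shift, x2 + shift
    return x1, x2


if __name__ == "__main__":
    start = (1.0, 0.0)  # all demand on path 1
    final = reroute(start)
    print(is_wardrop(start), final, is_wardrop(final))
```

With unit total demand, the two latencies are equal when path 1 carries two thirds of the flow, which is the point the iteration settles on under these assumed latency functions.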
Chance-Constrained Control with Lexicographic Deep Reinforcement Learning
This paper proposes a lexicographic Deep Reinforcement Learning (DeepRL)-based approach to chance-constrained Markov Decision Processes, in which the controller seeks to ensure that the probability of satisfying the constraint is above a given threshold. Standard DeepRL approaches require i) the constraints to be included as additional weighted terms in the cost function, in a multi-objective fashion, and ii) the tuning of the introduced weights during the training phase of the Deep Neural Network (DNN) according to the probability thresholds. The proposed approach, instead, requires separately training one constraint-free DNN and one DNN associated with each constraint and then, at each time-step, selecting which DNN to use depending on the observed system state. The presented solution does not require any hyper-parameter tuning besides the standard DNN ones, even if the probability thresholds change. A lexicographic version of the well-known DeepRL algorithm DQN is also proposed and validated via simulations.
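As a rough illustration of the lexicographic idea, the sketch below orders the constraint before the task objective when choosing an action. It is a generic selection rule under stated assumptions, not the rule used in the paper: `q_task` and `p_safe` are hypothetical per-action outputs of a constraint-free network and a constraint network, and the fallback to the safest action is an assumption made here.

```python
def lexicographic_action(q_task, p_safe, threshold):
    """Pick the best task action among those meeting the chance constraint.

    q_task[a]  : hypothetical task-value estimate for action a
    p_safe[a]  : hypothetical estimated probability that action a
                 keeps the constraint satisfied
    threshold  : required probability of constraint satisfaction
    """
    admissible = [a for a, p in enumerate(p_safe) if p >= threshold]
    if not admissible:
        # Assumed fallback: no action meets the threshold, take the safest one.
        return max(range(len(p_safe)), key=lambda a: p_safe[a])
    return max(admissible, key=lambda a: q_task[a])


if __name__ == "__main__":
    q_task = [1.0, 5.0, 3.0]
    p_safe = [0.99, 0.50, 0.95]
    for threshold in (0.4, 0.9, 0.995):
        print(threshold, lexicographic_action(q_task, p_safe, threshold))
```

Because the threshold only enters at selection time, changing it in this sketch does not touch the two sets of estimates, which mirrors the abstract's point that no weight re-tuning is needed when thresholds change.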
Bellman's principle of optimality and deep reinforcement learning for time-varying tasks
This paper presents the first framework (to the best of the authors' knowledge) to address time-varying objectives in finite-horizon Deep Reinforcement Learning (DeepRL), based on a switching control solution built on Bellman's principle of optimality. By augmenting the state space of the system with information on its visit time, the DeepRL agent is able to solve problems in which its task dynamically changes within the same episode. To address the scalability problems caused by the state space augmentation, we propose a procedure to partition the episode length to define separate sub-problems that are then solved by specialised DeepRL agents. Contrary to standard solutions, with the proposed approach the DeepRL agents correctly estimate the value function at each time-step and are hence able to solve time-varying tasks. Numerical simulations validate the approach in a classic RL environment.
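The two ingredients named in the abstract above, time-augmented states and a partition of the episode among specialised agents, can be sketched as follows. The function names, the normalised-time encoding and the equal-length partition are assumptions for illustration; the paper's actual partitioning procedure is not described in this listing.

```python
def augment_state(state, t, horizon):
    """Append the normalised visit time to the state (assumed encoding)."""
    return tuple(state) + (t / horizon,)


def agent_for_step(t, horizon, n_agents):
    """Index of the specialised agent responsible for time-step t.

    Assumes the episode is split into n_agents segments of equal length,
    the last one possibly shorter.
    """
    segment = -(-horizon // n_agents)  # ceiling division
    return min(t // segment, n_agents - 1)


if __name__ == "__main__":
    horizon, n_agents = 200, 4
    for t in (0, 49, 50, 199):
        obs = augment_state((0.5, -1.0), t, horizon)
        print(t, obs, agent_for_step(t, horizon, n_agents))
```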
Smart Healthy Schools: An IoT-enabled concept for multi-room dynamic air quality control
Smart Healthy Schools (SHS) are a new paradigm in building engineering and infection risk control in school buildings, in which the disciplines of Indoor Air Quality (IAQ), the Internet of Things (IoT) and Artificial Intelligence (AI) merge. In the post-pandemic era, equipping schools with a network of smart IoT sensors has become critical for optimal control of IAQ and for lowering the airborne infection risk of several pathogens, which is indirectly related to cumulative human-emitted CO2 levels over time. Thermal energy waste in winter due to improved air renewal remains a major concern, but it can be well monitored within an SHS monitoring architecture thanks to the flexibility of the LoRaWAN protocol, which can also handle a large amount of energy and climatic data at room and building scale. In this work, we report the design of the AulaSicura platform, an IoT control system co-designed by the main author and Gizero Energie to implement the SHS paradigm via clearly visible (and audible) alarm signalling in existing and new school buildings. The cloud-based LoRa system is capable of continuous and simultaneous monitoring of a variety of sensors and IAQ parameters, including indoor/outdoor temperatures, relative humidities and human-emitted excess CO2. The multi-room monitoring concept of indoor CO2 levels allows centralized control of natural ventilation levels in individual classrooms and can handle (quasi-)real-time data, relevant for data post-processing and future developments in (quasi-)real-time assessment of IAQ and infection risk levels at single-room scale. The sensor network is also extensible to up to one thousand classrooms per LoRa node, allowing centralized control of entire school districts at an urban scale. Moreover, through Modbus-LoRa I/O converters, AulaSicura can also control the same number of mechanical ventilation units per node, in either pure or hybrid mechanical ventilation modes.
Efficient and Risk-Aware Control of Electricity Distribution Grids
This article presents an economic model predictive control (EMPC) algorithm for reducing losses and increasing the resilience of medium-voltage electricity distribution grids characterized by high penetration of renewable energy sources and possibly subject to natural or malicious adverse events. The proposed control system optimizes grid operations through network reconfiguration, control of distributed energy storage systems (ESSs), and on-load tap changers. The core of the EMPC algorithm is a nonconvex optimization problem integrating the ESS dynamics, the topological and power technical constraints of the grid, and the modeling of the cascading effects of potential adverse events. An equivalent (i.e., having the same optimal solution) proxy of the nonconvex problem is proposed to make the solution more tractable. Simulations performed on a 16-bus test distribution network validate the proposed control strategy.
Distributed MARL with Limited Sensing for Robot Navigation Problems
This paper proposes a Multi-Agent Reinforcement Learning (MARL) algorithm for the multi-robot navigation problem. Most of the proposals in the literature require some form of information sharing and communication among agents to coordinate their actions in order to complete the overall task. The proposed algorithm, named Limited Sensing MARL (LS-MARL), assumes that each robot's decisions rely on local information and that each robot is equipped with a sensor, which can be switched on to localize the other robots within a given range. Besides the navigation task, each agent aims at limiting the use of the sensor as much as possible (i.e., to be as independent as possible) for energy saving or safety reasons. The algorithm is evaluated by simulations and compares favourably with the one proposed by Yu et al. (2015), which assumes a similar setup in which the neighbouring agents share their positioning information.
Ensuring the Stability of Power Systems Against Dynamic Load Altering Attacks: A Robust Control Scheme Using Energy Storage Systems
This paper presents a robust scheme to protect the power transmission network against a class of feedback-based attacks referred to in the literature as "Dynamic Load Altering Attacks" (D-LAAs). The proposed scheme envisages the usage of Energy Storage Systems (ESSs) to avoid the destabilising effects that a malicious state feedback has on the power network generators. The methodologies utilised are based on results from polytopic uncertain systems, invariance theory and Lyapunov arguments. Numerical simulations on a test scenario validate the proposed approach.
Automated Optical Inspection for Printed Circuit Board Assembly Manufacturing with Transfer Learning and Synthetic Data Generation
Automated Optical Inspection (AOI) is among the most common and effective quality checks employed in production lines. This paper details the design of a Deep Learning solution that was developed to address a specific quality control check in a Printed Circuit Board Assembly (PCBA) manufacturing process. The developed Deep Neural Network exploits transfer learning and a synthetic data generation process so that it can be trained even if few data samples are available. The overall AOI system was designed to run on low-cost hardware with limited computing capabilities to ease its deployment in industrial settings.
Automatic Transportation Mode Recognition on Smartphone Data Based on Deep Neural Networks
In the last few years, with the exponential diffusion of smartphones, services for turn-by-turn navigation have seen a surge in popularity. Current solutions available in the market allow the user to select via an interface the desired transportation mode, for which an optimal route is then computed. Automatically recognizing the transportation mode that the user is travelling by makes it possible to dynamically control, and consequently update, the route proposed to the user. Such a dynamic approach is an enabling technology for multi-modal transportation planners, in which the optimal path and its associated transportation solutions are updated in real-time based on data coming from (i) distributed sensors (e.g., smart traffic lights, road congestion sensors, etc.); (ii) service providers (e.g., car-sharing availability, bus waiting time, etc.); and (iii) the user’s own device, in compliance with the development of smart cities envisaged by the 5G architecture. In this paper, we present a series of Machine Learning approaches for real-time Transportation Mode Recognition and report their performance differences in our field tests. Several Machine Learning-based classifiers, including Deep Neural Networks, built on both statistical feature extraction and raw data analysis are presented and compared; the result analysis also highlights which features prove most informative for the classification.
Deep reinforcement learning control of white-light continuum generation
White-light continuum (WLC) generation in bulk media finds numerous applications in ultrafast optics and spectroscopy. Due to the complexity of the underlying spatiotemporal dynamics, WLC optimization typically follows empirical procedures. Deep reinforcement learning (RL) is a branch of machine learning dealing with the control of automated systems using deep neural networks. In this Letter, we demonstrate the capability of a deep RL agent to generate a long-term-stable WLC from a bulk medium without any previous knowledge of the system dynamics or functioning. This work demonstrates that RL can be exploited effectively to control complex nonlinear optical experiments.