
    Control of a Mixed Autonomy Signalised Urban Intersection: An Action-Delayed Reinforcement Learning Approach

    We consider a mixed autonomy scenario where the traffic intersection controller decides whether the traffic light will be green or red at each lane for multiple traffic-light blocks. The objective of the traffic intersection controller is to minimize the queue length at each lane and maximize the outflow of vehicles over each block. We consider that the traffic intersection controller informs the autonomous vehicle (AV) whether the traffic light will be green or red for the future traffic-light block. Thus, the AV can adapt its dynamics by solving an optimal control problem. We model the decision process of the traffic intersection controller as a deterministic delayed Markov decision process, owing to the delayed action by the traffic controller. We propose a reinforcement-learning-based algorithm to obtain the optimal policy. We show empirically that our algorithm converges and drastically reduces the energy costs of AVs when the traffic controller communicates with the AVs. Comment: Accepted for publication at the 24th IEEE International Conference on Intelligent Transportation Systems (ITSC 2021).
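    The delayed-action idea can be sketched with tabular Q-learning on a toy single-lane model: the Q-state is augmented with the buffer of not-yet-applied decisions, which is what makes the delayed decision process deterministic. The lane model, delay, and reward below are illustrative, not the paper's.

    ```python
    import random

    # Toy signalised lane, hypothetical: the queue grows with random arrivals
    # and shrinks when the light is green. The controller's action (0=red,
    # 1=green) takes effect only after DELAY steps, so the Q-learning state is
    # augmented with the buffer of pending actions (deterministic delayed MDP).
    DELAY = 2
    MAX_Q = 10

    def step(queue, light):
        arrivals = random.randint(0, 2)
        served = 3 if light == 1 else 0
        return max(0, min(MAX_Q, queue + arrivals - served))

    def train(episodes=500, alpha=0.1, gamma=0.95, eps=0.1):
        random.seed(0)
        Q = {}  # (queue, pending-action tuple) -> [value of red, value of green]
        for _ in range(episodes):
            queue, pending = 5, (0,) * DELAY
            for _ in range(50):
                s = (queue, pending)
                q = Q.setdefault(s, [0.0, 0.0])
                a = random.randrange(2) if random.random() < eps else q.index(max(q))
                applied, pending = pending[0], pending[1:] + (a,)  # delayed application
                queue2 = step(queue, applied)
                r = -queue2  # reward: minimise the queue length
                q2 = Q.setdefault((queue2, pending), [0.0, 0.0])
                q[a] += alpha * (r + gamma * max(q2) - q[a])
                queue = queue2
        return Q

    policy = train()
    ```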

    Event-Triggered Action-Delayed Reinforcement Learning Control of a Mixed Autonomy Signalised Urban Intersection

    We propose an event-triggered framework for deciding the traffic light at each lane in a mixed autonomy scenario. We deploy the decision after a suitable delay, and events are triggered based on the satisfaction of a predefined set of conditions. We design the trigger conditions and the delay to increase the vehicles’ throughput. This way, we achieve full exploitation of the potential of autonomous vehicles (AVs). The ultimate goal is to obtain vehicle flows led by AVs at the head. We formulate the decision process of the traffic intersection controller as a deterministic delayed Markov decision process, i.e., the action implementation and evaluation are delayed. We propose a Reinforcement Learning based model-free algorithm to obtain the optimal policy. We show by simulations that our algorithm converges and significantly reduces the average wait time and queue lengths as the fraction of AVs increases. Our algorithm significantly outperforms our previous work [1].
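    A minimal sketch of what such a trigger rule could look like; the condition names and thresholds are hypothetical, not the paper's design.

    ```python
    # Hypothetical trigger: switch the light when the queue on the red lane
    # exceeds the queue on the green lane by a margin, or when the current
    # phase has lasted longer than a maximum dwell time.
    def event_triggered(red_queue, green_queue, phase_age,
                        margin=3, max_dwell=10):
        return (red_queue - green_queue >= margin) or (phase_age >= max_dwell)
    ```

    The decision computed when the event fires would then be deployed after the designed delay, as in the delayed MDP formulation above.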

    Model-free feedback control synthesis from expert demonstration

    We show how it is possible to synthesize a stabilizing feedback control, in the complete absence of a model, starting from the open-loop control generated by an expert operator capable of driving a system to a specific set-point. We assume that the system is linear and discrete-time. We propose two different controllers: a linear dynamic one and a static, piecewise-linear one. We show the performance of the proposed controllers on a ship steering problem.
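    As a toy illustration of the static variant, assuming a scalar linear system and a single recorded expert run, a least-squares fit of u ≈ -Kx over the demonstration already yields a stabilizing gain; the system and expert below are made up.

    ```python
    # Minimal sketch: recover a static feedback from an expert's recorded run.
    # The scalar model x[k+1] = a*x[k] + b*u[k] generates the data but is never
    # used by the synthesis, which only sees the recorded (x, u) pairs.
    def fit_static_gain(xs, us):
        # Least-squares fit of u ≈ -K x over the expert trajectory.
        num = sum(u * x for x, u in zip(xs, us))
        den = sum(x * x for x in xs)
        return -num / den

    # Expert demonstration: the operator drives x from 4 toward the origin.
    a, b = 1.2, 1.0
    xs, us = [4.0], []
    for _ in range(10):
        u = -0.9 * xs[-1]          # the expert's inputs, recorded open loop
        us.append(u)
        xs.append(a * xs[-1] + b * u)

    K = fit_static_gain(xs[:-1], us)
    ```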

    Crossing the Reality Gap: a Survey on Sim-to-Real Transferability of Robot Controllers in Reinforcement Learning

    The growing demand for robots able to act autonomously in complex scenarios has widely accelerated the introduction of Reinforcement Learning (RL) in robot control applications. However, the trial-and-error nature intrinsic to RL may result in long training times on real robots and, moreover, may lead to dangerous outcomes. While simulators are useful tools for accelerating RL training and ensuring safety, they often provide only an approximate model of the robot dynamics and of its interaction with the surrounding environment, resulting in what is called the reality gap (RG): a mismatch between simulated and real control-law performance caused by the inaccurate representation of the real environment in simulation. The most undesirable outcome occurs when the controller learnt in simulation fails the task on the real robot, resulting in an unsuccessful sim-to-real transfer. The goal of the present survey is threefold: (1) to identify the main approaches to facing the RG problem in the context of robot control with RL, (2) to point out their shortcomings, and (3) to outline new potential research areas.
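    One standard family of approaches covered by such surveys is domain randomization; a minimal sketch follows, with toy dynamics and illustrative parameter ranges (not from the survey itself).

    ```python
    import random

    # Domain randomization sketch: instead of training against a single nominal
    # simulator, each episode samples the uncertain physical parameters (here a
    # toy mass and friction coefficient) from a range, so the learnt controller
    # must work across the whole family of dynamics rather than one instance.
    def sample_randomized_dynamics(rng):
        mass = rng.uniform(0.8, 1.2)       # +/-20% around the nominal mass
        friction = rng.uniform(0.05, 0.3)  # poorly identified in simulation
        def step(x, v, u, dt=0.05):
            acc = (u - friction * v) / mass
            return x + v * dt, v + acc * dt
        return step

    rng = random.Random(0)
    episodes = [sample_randomized_dynamics(rng) for _ in range(3)]
    ```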

    Machine learning for computationally efficient electrical loads estimation in consumer washing machines

    Estimating the wear of the individual electrical parts of a home appliance without resorting to a large number of sensors is desirable for ensuring a proper level of maintenance by the manufacturer. Deep learning techniques can be effective tools for such estimation from relatively poor measurements, but their computational demands must be carefully considered for actual deployment. In this work, we employ one-dimensional Convolutional Neural Networks and Long Short-Term Memory networks to infer the status of some electrical components of different models of washing machines from the electrical signals measured at the plug. These tools are trained and tested on a large dataset (502 washing cycles, 1000 h) collected from four different washing machines, and are carefully designed to comply with the memory constraints imposed by the hardware selected for a real implementation. The approach is end-to-end, i.e., it does not require any feature extraction except the harmonic decomposition of the electrical signals, and thus it can be easily generalized to other appliances.
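    The harmonic decomposition mentioned above can be sketched with a plain DFT over one mains period; the synthetic signal and harmonic count here are illustrative, not washing-machine data.

    ```python
    import cmath, math

    # Sketch of the one feature-extraction step the abstract mentions: given
    # one mains period of the measured plug current, a DFT yields the amplitude
    # of each harmonic; these per-period amplitudes are what a 1-D CNN or LSTM
    # would then consume.
    def harmonic_amplitudes(samples, n_harmonics=5):
        n = len(samples)
        amps = []
        for h in range(1, n_harmonics + 1):
            coeff = sum(s * cmath.exp(-2j * math.pi * h * k / n)
                        for k, s in enumerate(samples))
            amps.append(2 * abs(coeff) / n)
        return amps

    # Synthetic test signal: one period of a sine with 3rd-harmonic distortion.
    n = 200
    period = [math.sin(2 * math.pi * k / n) + 0.2 * math.sin(6 * math.pi * k / n)
              for k in range(n)]
    amps = harmonic_amplitudes(period)
    ```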

    Evolutionary Machine Learning in Robotics

    In this chapter, we survey the most significant applications of EML to robotics. We first highlight the salient characteristics of the field in terms of what can be optimized and with what aims and constraints. We then survey the large literature concerning the optimization, by means of evolutionary computation, of artificial neural networks (traditionally considered a form of machine learning) used for controlling robots; to ease comprehension, we categorize the various approaches along different axes, e.g., the robotic task, the representation of the solutions, and the evolutionary algorithm being employed. We then survey the many uses of evolutionary computation for optimizing the morphology of robots, including those that tackle the challenging task of optimizing the morphology and the controller at the same time. Finally, we discuss the reality gap problem, which consists in a potential mismatch between the quality of solutions found in simulation and their quality observed in reality.
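    The neuroevolution setting surveyed here can be sketched with a (1+1) evolution strategy on the weights of a tiny neural controller; the task (steer a 1-D point to the origin) and network size are illustrative.

    ```python
    import math, random

    # A minimal (1+1) evolution strategy optimizing the four weights of a tiny
    # neural controller: mutate the current best with Gaussian noise and keep
    # the child if its fitness is at least as good.
    def controller(weights, x):
        h = math.tanh(weights[0] * x + weights[1])
        return weights[2] * h + weights[3]

    def fitness(weights):
        # Roll out the controller; lower accumulated error = higher fitness.
        x, cost = 2.0, 0.0
        for _ in range(20):
            x = x + 0.1 * controller(weights, x)
            cost += x * x
        return -cost

    rng = random.Random(1)
    best = [rng.uniform(-1, 1) for _ in range(4)]
    start_fit = best_fit = fitness(best)
    for _ in range(300):
        child = [w + rng.gauss(0, 0.2) for w in best]
        f = fitness(child)
        if f >= best_fit:
            best, best_fit = child, f
    ```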

    An Online Iterative Linear Quadratic Approach for a Satisfactory Working Point Attainment at FERMI

    The attainment of a satisfactory operating point is one of the main problems in the tuning of particle accelerators. These are extremely complex facilities, characterized by the absence of a model that accurately describes their dynamics, and by an often persistent noise which, along with machine drifts, affects their behaviour in unpredictable ways. In this paper, we propose an online iterative Linear Quadratic Regulator (iLQR) approach to tackle this problem on the FERMI free-electron laser of Elettra Sincrotrone Trieste. It consists of a model identification performed by a neural network trained on data collected from the real facility, followed by the application of the iLQR in a Model Predictive Control fashion. We perform several experiments, training the neural network with increasing amounts of data, in order to understand what level of model accuracy is needed to accomplish the task. We empirically show that the online iLQR results, on average, in fewer steps than a simple gradient ascent (GA), and requires a less accurate neural network to achieve the goal.
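    For a scalar linearised model, the iLQR backward pass reduces to the discrete Riccati recursion; the sketch below shows the Model-Predictive use (re-plan each step, apply only the first gain), with a stand-in linear model rather than the NN-identified FERMI dynamics.

    ```python
    # For x[k+1] = a*x[k] + b*u[k] with stage cost q*x^2 + r*u^2, the iLQR
    # backward pass is the scalar discrete Riccati recursion below.
    def lqr_gains(a, b, q, r, horizon):
        p, gains = q, []
        for _ in range(horizon):
            k = (b * p * a) / (r + b * p * b)   # feedback gain, u = -k*x
            p = q + a * p * a - a * p * b * k   # Riccati update
            gains.append(k)
        return gains[::-1]  # time-ordered over the horizon

    def mpc_rollout(a, b, x0, steps=30, horizon=10, q=1.0, r=0.1):
        # MPC fashion: re-plan over the horizon, apply only the first gain.
        x = x0
        for _ in range(steps):
            k = lqr_gains(a, b, q, r, horizon)[0]
            x = a * x + b * (-k * x)
        return x

    residual = mpc_rollout(a=1.1, b=0.5, x0=5.0)  # unstable open loop, a > 1
    ```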

    Model Predictive Control of glucose concentration based on Signal Temporal Logic specifications

    Insulin is a peptide hormone produced by the pancreas to regulate the cells’ intake of glucose from the blood. Type 1 diabetes compromises this capacity of the pancreas. Patients with this disease inject insulin to regulate the level of glucose in the blood, thus reducing the risk of long-term complications. The Artificial Pancreas (AP) is a wearable device developed to provide automatic delivery of insulin, allowing a potentially significant improvement in the quality of life of patients. In this paper we apply to the AP a Model Predictive Controller able to generate state trajectories that meet constraints expressed through Signal Temporal Logic (STL). Such a form of constraints is indeed appropriate for the AP, in which some requirements result in hard constraints (absolutely avoid hypoglycaemia) and others in soft constraints (avoid prolonged hyperglycaemia). We rely on the BluSTL toolbox, which allows controllers to be generated automatically from STL specifications. We perform simulations on two different scenarios: an MPC controller that uses the same constraints as [1], and an MPC-STL controller in both deterministic and adversarial environments (robust control). We show that the soft constraints permitted by STL avoid unnecessary restrictions, providing safe trajectories under higher disturbances.
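    The quantitative (robust) semantics behind such constraints can be sketched for a simple "always stay in the euglycaemic band" specification; the 70-180 mg/dL bounds are the usual clinical range, and the traces are made up.

    ```python
    # Robustness of the STL safety spec G_[0,T] (70 <= glucose <= 180): the
    # worst-case margin to the hypo-/hyperglycaemia bounds over the horizon.
    # Positive robustness means the trajectory satisfies the spec with that
    # much slack; an MPC-STL controller maximises this kind of margin.
    def always_in_band_robustness(trace, low=70.0, high=180.0):
        return min(min(g - low, high - g) for g in trace)

    safe = [110.0, 120.0, 150.0, 140.0]   # stays in band, slack 30 mg/dL
    risky = [110.0, 65.0, 150.0]          # dips below the hypoglycaemia bound
    ```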

    Singularity Avoidance for Cart-Mounted Hand-Guided Collaborative Robots: A Variational Approach

    Most collaborative robots (cobots) can be taught by hand guiding: essentially, by manually jogging the robot, an operator teaches some configurations to be employed as via points. Based on those via points, Cartesian end-effector trajectories such as straight lines, circular arcs or splines are then constructed. Such methods can, in principle, be employed for cart-mounted cobots (i.e., when the jogging involves one or two linear axes besides the cobot axes). However, in some applications, the sole imposition of via points in Cartesian space is not sufficient. On the contrary, although the overall system is redundant, (i) the via points must be reached at the taught joint configurations, and (ii) undesirable singularity (and near-singularity) conditions must be avoided. The naive approach, consisting of setting the cart trajectory beforehand (for instance, by imposing a linear-in-time motion law that crosses the taught cart configurations), satisfies the first requirement but does not guarantee the second. Here, we propose an approach consisting of (i) a novel strategy for decoupling the planning of the cart trajectory from that of the robot joints, and (ii) a novel variational technique for computing the former in a singularity-aware fashion, ensuring the avoidance of a class of workspace singularity and near-singularity configurations.
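    For intuition on the singularity measure involved, consider a 2-link planar arm, where the manipulability sqrt(det(J J^T)) reduces to |l1*l2*sin(q2)| and vanishes with the elbow stretched or folded; a planner in the spirit of the paper would keep this measure above a threshold along the trajectory. Link lengths and threshold below are illustrative.

    ```python
    import math

    # Manipulability of a 2-link planar arm: sqrt(det(J J^T)) = |l1*l2*sin(q2)|,
    # which is zero at the elbow singularities q2 = 0 and q2 = pi.
    def manipulability(q2, l1=0.5, l2=0.4):
        return abs(l1 * l2 * math.sin(q2))

    # A singularity-aware planner would reject (or penalise) configurations
    # whose manipulability falls below a chosen threshold.
    def near_singular(q2, threshold=0.05):
        return manipulability(q2) < threshold
    ```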

    Closed-loop Control from Data-Driven Open-Loop Optimal Control Trajectories

    We show how recent works on data-driven open-loop minimum-energy control for linear systems can be exploited to obtain closed-loop piecewise-affine control laws, by employing a state-space partitioning technique that underlies static relatively optimal control. In addition, we propose a way of employing portions of the experimental input and state trajectories to recover information about the natural movement of the state and to deal with non-zero initial conditions. The same idea can be used to formulate several open-loop control problems entirely based on data, possibly including input and state constraints.
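    The cited works build the control directly from data matrices; as a simpler illustrative stand-in, the sketch below takes an identify-then-design route on a scalar system: fit (a, b) from one recorded experiment by least squares, then compute the minimum-energy open-loop sequence from the scalar reachability Gramian. All numbers are made up.

    ```python
    import random

    # Least-squares identification of x[k+1] = a*x[k] + b*u[k] from one run.
    def identify(xs, us):
        sxx = sum(x * x for x in xs[:-1])
        sxu = sum(x * u for x, u in zip(xs, us))
        suu = sum(u * u for u in us)
        sxy = sum(x * y for x, y in zip(xs, xs[1:]))
        suy = sum(u * y for u, y in zip(us, xs[1:]))
        det = sxx * suu - sxu * sxu
        return ((suu * sxy - sxu * suy) / det,
                (sxx * suy - sxu * sxy) / det)

    # Minimum-energy open-loop input steering x0 to target in n steps:
    # u[k] = b * a^(n-1-k) * (target - a^n x0) / Gramian.
    def min_energy_sequence(a, b, x0, target, n):
        gram = sum(b * b * a ** (2 * (n - 1 - j)) for j in range(n))
        gap = target - a ** n * x0
        return [b * a ** (n - 1 - k) * gap / gram for k in range(n)]

    # Recorded experiment: random inputs applied to the (unknown) true system.
    rng = random.Random(2)
    a_true, b_true = 0.9, 0.5
    us = [rng.uniform(-1.0, 1.0) for _ in range(20)]
    xs = [0.0]
    for u in us:
        xs.append(a_true * xs[-1] + b_true * u)

    a_hat, b_hat = identify(xs, us)
    u_seq = min_energy_sequence(a_hat, b_hat, x0=1.0, target=3.0, n=4)
    ```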