3,151 research outputs found

    Automating Vehicles by Deep Reinforcement Learning using Task Separation with Hill Climbing

    Within the context of autonomous driving, a model-based reinforcement learning algorithm is proposed for the design of neural network-parameterized controllers. Classical model-based control methods, which include sampling- and lattice-based algorithms and model predictive control, suffer from a trade-off between model complexity and the computational burden of solving expensive optimization or search problems online at every short sampling time. To circumvent this trade-off, a 2-step procedure is motivated: first, a controller is learned during offline training based on an arbitrarily complicated mathematical system model; second, the trained controller is evaluated online by fast feedforward computation. The contribution of this paper is the proposition of a simple gradient-free and model-based algorithm for deep reinforcement learning using task separation with hill climbing (TSHC). In particular, (i) simultaneous training on separate deterministic tasks with the purpose of encoding many motion primitives in a neural network, and (ii) the employment of maximally sparse rewards in combination with virtual velocity constraints (VVCs) in setpoint proximity are advocated. Comment: 10 pages, 6 figures, 1 tabl
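The abstract above combines gradient-free hill climbing, simultaneous training on separate tasks, and sparse rewards. The following is only a minimal sketch of that general idea, not the TSHC algorithm itself: the two-parameter linear "controller", the `TASKS` list, the tolerance, and all function names are hypothetical stand-ins chosen for illustration, and virtual velocity constraints are not modeled.

```python
import random

def sparse_reward(theta, task, tol=0.25):
    """Return 1 if the (hypothetical) linear controller hits the task target within tol, else 0."""
    x, target = task
    w, b = theta
    return 1 if abs(w * x + b - target) < tol else 0

def evaluate(theta, tasks):
    """Accumulate the sparse reward over every separate deterministic task."""
    return sum(sparse_reward(theta, t) for t in tasks)

def hill_climb(tasks, iterations=2000, sigma=0.3, seed=0):
    """Gradient-free search: perturb the parameters and keep the candidate if it is no worse."""
    rng = random.Random(seed)
    best = [0.0, 0.0]
    best_score = evaluate(best, tasks)
    for _ in range(iterations):
        cand = [p + rng.gauss(0.0, sigma) for p in best]
        score = evaluate(cand, tasks)
        # Accept ties so the search can drift across flat plateaus of the sparse reward.
        if score >= best_score:
            best, best_score = cand, score
    return best, best_score

# Hypothetical (input, target) pairs, one per task; assumed for illustration only.
TASKS = [(-1.0, -2.0), (0.0, 0.0), (1.0, 2.0)]
theta, score = hill_climb(TASKS)
print("tasks solved:", score, "of", len(TASKS))
```

Because a candidate is accepted only when the summed reward does not decrease, the number of solved tasks can never fall below that of the starting parameters.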

    Unmanned Aerial Systems for Wildland and Forest Fires

    Wildfires represent an important natural risk causing economic losses, human death and significant environmental damage. In recent years, we have witnessed an increase in fire intensity and frequency. Research has been conducted towards the development of dedicated solutions for wildland and forest fire assistance and fighting. Systems were proposed for the remote detection and tracking of fires. These systems have shown improvements in the area of efficient data collection and fire characterization within small-scale environments. However, wildfires cover large areas, making some of the proposed ground-based systems unsuitable for optimal coverage. To tackle this limitation, Unmanned Aerial Systems (UAS) were proposed. UAS have proven to be useful due to their maneuverability, allowing for the implementation of remote sensing, allocation strategies and task planning. They can provide a low-cost alternative for the prevention, detection and real-time support of firefighting. In this paper we review previous work related to the use of UAS in wildfires. Onboard sensor instruments, fire perception algorithms and coordination strategies are considered. In addition, we present some of the recent frameworks proposing the use of both aerial vehicles and Unmanned Ground Vehicles (UV) for a more efficient wildland firefighting strategy at a larger scale. Comment: A recent published version of this paper is available at: https://doi.org/10.3390/drones501001

    Jointly Learning to Construct and Control Agents using Deep Reinforcement Learning

    The physical design of a robot and the policy that controls its motion are inherently coupled, and should be determined according to the task and environment. In an increasing number of applications, data-driven and learning-based approaches, such as deep reinforcement learning, have proven effective at designing control policies. For most tasks, the only way to evaluate a physical design with respect to such control policies is empirical--i.e., by picking a design and training a control policy for it. Since training these policies is time-consuming, it is computationally infeasible to train separate policies for all possible designs as a means to identify the best one. In this work, we address this limitation by introducing a method that performs simultaneous joint optimization of the physical design and control network. Our approach maintains a distribution over designs and uses reinforcement learning to optimize a control policy to maximize expected reward over the design distribution. We give the controller access to design parameters to allow it to tailor its policy to each design in the distribution. Throughout training, we shift the distribution towards higher-performing designs, eventually converging to a design and control policy that are jointly optimal. We evaluate our approach in the context of legged locomotion, and demonstrate that it discovers novel designs and walking gaits, outperforming baselines in both performance and efficiency
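The idea of maintaining a distribution over designs and shifting it towards higher-performing ones can be sketched as follows. This is a toy illustration under strong assumptions, not the authors' method: the design is a single scalar, the trained control policy is replaced by a hypothetical closed-form `reward` function with its optimum at 2.0, and the distribution mean is updated with a plain score-function (REINFORCE-style) gradient estimate using the batch mean as baseline.

```python
import random

def reward(design):
    """Hypothetical stand-in for 'train/run a controller on this design': best design is 2.0."""
    return -(design - 2.0) ** 2

def optimize_design(steps=200, batch=64, sigma=0.5, lr=0.05, seed=0):
    """Shift the mean of a Gaussian design distribution towards higher-reward designs."""
    rng = random.Random(seed)
    mu = 0.0  # initial mean of the design distribution (assumed)
    for _ in range(steps):
        designs = [rng.gauss(mu, sigma) for _ in range(batch)]
        rewards = [reward(d) for d in designs]
        baseline = sum(rewards) / batch  # batch-mean baseline reduces variance
        # Score-function gradient of expected reward w.r.t. mu for a Gaussian with fixed sigma.
        grad = sum((r - baseline) * (d - mu) / sigma**2
                   for r, d in zip(rewards, designs)) / batch
        mu += lr * grad
    return mu

mu = optimize_design()
print("final design mean:", round(mu, 2))
```

In the setting described by the abstract, the controller would also receive the sampled design parameters as input and be trained alongside the distribution; here that step is collapsed into the fixed `reward` function to keep the sketch short.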

    A regressive machine-learning approach to the non-linear complex FAST model for hybrid floating offshore wind turbines with integrated oscillating water columns

    Offshore wind energy is getting increasing attention as a clean alternative to the currently scarce fossil fuels mainly used in Europe's electricity supply. The further development and implementation of this kind of technology will help fight global warming, allowing a more sustainable and decarbonized power generation. In this sense, the integration of Floating Offshore Wind Turbines (FOWTs) with Oscillating Water Column (OWC) devices arises as a promising solution for hybrid renewable energy production. In these systems, OWC modules are employed not only for wave energy generation but also for FOWT stabilization and cost-efficiency. Nevertheless, analyzing and understanding the control performance of the aero-hydro-servo-elastic floating structure is an intricate and challenging task, even more so given the increase in dynamical complexity that the incorporation of OWCs within the FOWT platform involves. In this regard, although some time and frequency domain models have been developed, they are complex, computationally inefficient and suitable for neither real-time nor feedback control. In this context, this work presents a novel control-oriented regressive model for hybrid FOWT-OWCs platforms. The main objective is to take advantage of the predictive and forecasting capabilities of deep-layered artificial neural networks (ANNs), jointly with their computational simplicity, to develop a feasible control-oriented and lightweight model compared to the aforementioned complex dynamical models. In order to achieve this objective, a deep-layered ANN model has been designed and trained to match the hybrid platform's structural performance. Then, the obtained scheme has been benchmarked against standard Multisurf-Wamit-FAST 5MW FOWT output data for different challenging scenarios in order to validate the model. The results demonstrate the adequate performance and accuracy of the proposed ANN control-oriented model, providing a great alternative to the complex non-linear models traditionally used and allowing the implementation of advanced control schemes in a computationally convenient, straightforward, and easy way. This work was supported in part by the Basque Government through project IT1555-22 and through the projects PID2021-123543OB-C21 and PID2021-123543OB-C22 (MCIN/AEI/10.13039/501100011033/FEDER, UE). The authors would also like to thank the UPV/EHU for the financial support through the María Zambrano grant MAZAM22/15 and Margarita Salas grant MARSA22/09 (UPV-EHU/MIU/Next Generation, EU) and through grant PIF20/299 (UPV/EHU)

    Multi-criteria Evolution of Neural Network Topologies: Balancing Experience and Performance in Autonomous Systems

    The majority of Artificial Neural Network (ANN) implementations in autonomous systems use a fixed/user-prescribed network topology, leading to sub-optimal performance and low portability. The existing NeuroEvolution of Augmenting Topologies (NEAT) paradigm offers a powerful alternative by allowing the network topology and the connection weights to be simultaneously optimized through an evolutionary process. However, most NEAT implementations allow the consideration of only a single objective. There also persists the question of how to tractably introduce topological diversification that mitigates overfitting to training scenarios. To address these gaps, this paper develops a multi-objective neuro-evolution algorithm. While adopting the basic elements of NEAT, important modifications are made to the selection, speciation, and mutation processes. With the backdrop of small-robot path-planning applications, an experience-gain criterion is derived to encapsulate the amount of diverse local environment encountered by the system. This criterion facilitates the evolution of genes that support exploration, thereby seeking to generalize from a smaller set of mission scenarios than possible with performance maximization alone. The effectiveness of the single-objective (optimizing performance) and the multi-objective (optimizing performance and experience-gain) neuro-evolution approaches is evaluated on two different small-robot cases, with ANNs obtained by the multi-objective optimization observed to provide superior performance in unseen scenarios.
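Optimizing performance and experience-gain together means candidates are compared on two objectives at once. A common way to do this in multi-objective evolutionary algorithms is Pareto dominance; the sketch below illustrates that generic idea and is not taken from the paper, whose selection mechanism is not described in the abstract. The `CANDIDATES` scores are made-up (performance, experience-gain) pairs, both to be maximized.

```python
def dominates(a, b):
    """True if a is no worse than b on every objective and strictly better on at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(population):
    """Keep only the candidates that no other candidate dominates."""
    return [p for p in population if not any(dominates(q, p) for q in population)]

# Hypothetical (performance, experience_gain) scores for five evolved networks.
CANDIDATES = [(0.9, 0.2), (0.6, 0.7), (0.5, 0.5), (0.3, 0.9), (0.9, 0.1)]
front = pareto_front(CANDIDATES)
print(front)
```

Candidates on the front represent different trade-offs between the two objectives; a single-objective run would instead keep only the highest-performance candidate.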