Automating Vehicles by Deep Reinforcement Learning using Task Separation with Hill Climbing
Within the context of autonomous driving, a model-based reinforcement learning
algorithm is proposed for the design of neural network-parameterized
controllers. Classical model-based control methods, which include sampling- and
lattice-based algorithms and model predictive control, suffer from the
trade-off between model complexity and computational burden required for the
online solution of expensive optimization or search problems at every short
sampling time. To circumvent this trade-off, a 2-step procedure is motivated:
first learning of a controller during offline training based on an arbitrarily
complicated mathematical system model, before online fast feedforward
evaluation of the trained controller. The contribution of this paper is the
proposition of a simple gradient-free and model-based algorithm for deep
reinforcement learning using task separation with hill climbing (TSHC). In
particular, (i) simultaneous training on separate deterministic tasks with the
purpose of encoding many motion primitives in a neural network, and (ii) the
employment of maximally sparse rewards in combination with virtual velocity
constraints (VVCs) in setpoint proximity are advocated.
Comment: 10 pages, 6 figures, 1 tabl
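The gradient-free idea in the abstract above can be illustrated with a minimal sketch. This is not the paper's TSHC algorithm: the 1-D integrator plant, the single scalar `gain` standing in for network weights, the arrival-time reward, and all function names are assumptions chosen for illustration. It shows only the general pattern of perturbing parameters at random and keeping a candidate when its summed reward over several separate deterministic tasks does not get worse.

```python
import random

def rollout(gain, start, target, steps=50, tol=0.05):
    """Simulate a toy 1-D plant x += gain * (target - x) on one deterministic task.

    The reward depends only on when the setpoint is reached (a stand-in for
    a sparse reward); it is -steps if the setpoint is never reached.
    """
    x = start
    for t in range(steps):
        if abs(target - x) < tol:
            return -t  # fewer steps to arrival is better
        x += gain * (target - x)
    return -steps

def total_reward(gain, tasks):
    """Sum the reward over all separate tasks, given as (start, target) pairs."""
    return sum(rollout(gain, start, target) for start, target in tasks)

def hill_climb(tasks, iterations=200, sigma=0.1, initial_gain=0.0, seed=0):
    """Gradient-free search: perturb the parameter, keep it if it is not worse."""
    rng = random.Random(seed)
    best_gain = initial_gain
    best_score = total_reward(best_gain, tasks)
    for _ in range(iterations):
        candidate = best_gain + rng.gauss(0.0, sigma)
        score = total_reward(candidate, tasks)
        # Accepting ties lets the search drift across flat reward plateaus.
        if score >= best_score:
            best_gain, best_score = candidate, score
    return best_gain, best_score
```

Calling `hill_climb([(0.0, 1.0), (0.0, -1.0), (2.0, 0.5)])` returns the best gain found and its summed reward. Because candidates are only accepted when they are at least as good, the returned score is never below that of the initial gain.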
Unmanned Aerial Systems for Wildland and Forest Fires
Wildfires represent a major natural risk, causing economic losses, loss of human
life and significant environmental damage. In recent years, fire intensity and
frequency have increased. Research has been conducted towards
the development of dedicated solutions for wildland and forest fire assistance
and fighting. Systems were proposed for the remote detection and tracking of
fires. These systems have shown improvements in the area of efficient data
collection and fire characterization within small scale environments. However,
wildfires cover large areas making some of the proposed ground-based systems
unsuitable for optimal coverage. To tackle this limitation, Unmanned Aerial
Systems (UAS) were proposed. UAS have proven to be useful due to their
maneuverability, allowing for the implementation of remote sensing, allocation
strategies and task planning. They can provide a low-cost alternative for the
prevention, detection and real-time support of firefighting. In this paper we
review previous work related to the use of UAS in wildfires. Onboard sensor
instruments, fire perception algorithms and coordination strategies are
considered. In addition, we present some of the recent frameworks proposing the
use of both aerial vehicles and Unmanned Ground Vehicles (UGV) for a more
efficient wildland firefighting strategy at a larger scale.
Comment: A recently published version of this paper is available at:
https://doi.org/10.3390/drones501001
Jointly Learning to Construct and Control Agents using Deep Reinforcement Learning
The physical design of a robot and the policy that controls its motion are
inherently coupled, and should be determined according to the task and
environment. In an increasing number of applications, data-driven and
learning-based approaches, such as deep reinforcement learning, have proven
effective at designing control policies. For most tasks, the only way to
evaluate a physical design with respect to such control policies is
empirical--i.e., by picking a design and training a control policy for it.
Since training these policies is time-consuming, it is computationally
infeasible to train separate policies for all possible designs as a means to
identify the best one. In this work, we address this limitation by introducing
a method that performs simultaneous joint optimization of the physical design
and control network. Our approach maintains a distribution over designs and
uses reinforcement learning to optimize a control policy to maximize expected
reward over the design distribution. We give the controller access to design
parameters to allow it to tailor its policy to each design in the distribution.
Throughout training, we shift the distribution towards higher-performing
designs, eventually converging to a design and control policy that are jointly
optimal. We evaluate our approach in the context of legged locomotion, and
demonstrate that it discovers novel designs and walking gaits, outperforming
baselines in both performance and efficiency.
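The loop described in the abstract above (a distribution over designs, a controller that sees the design parameters, and a gradual shift towards better designs) can be sketched on a toy problem. Everything here is an assumption made for illustration and not the paper's method: the three candidate designs, the analytic `reward`, the one-parameter linear policy, and the replacement of deep reinforcement learning with exact gradient ascent on the expected reward.

```python
import math

DESIGNS = [0.5, 1.0, 1.5]  # hypothetical candidate design parameters

def reward(design, weight):
    """Toy reward: design 1.0 is best, and the best action equals the design."""
    action = weight * design  # the controller is given the design parameter
    return -(design - 1.0) ** 2 - (action - design) ** 2

def softmax(logits):
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def train(iterations=200, policy_lr=0.1, design_lr=0.5):
    logits = [0.0] * len(DESIGNS)  # start from a uniform design distribution
    weight = 0.0                   # single policy parameter
    for _ in range(iterations):
        probs = softmax(logits)
        # Policy step: exact gradient of the expected reward over the
        # current design distribution (stand-in for an RL update).
        grad = sum(p * (-2.0 * (weight * d - d) * d)
                   for p, d in zip(probs, DESIGNS))
        weight += policy_lr * grad
        # Design step: raise the preference of designs that beat the
        # distribution's average reward under the current policy.
        rewards = [reward(d, weight) for d in DESIGNS]
        baseline = sum(p * r for p, r in zip(probs, rewards))
        logits = [l + design_lr * (r - baseline)
                  for l, r in zip(logits, rewards)]
    return softmax(logits), weight
```

In this toy setting the untrained policy initially makes the smallest design look best. Once the policy improves, the distribution shifts to design 1.0, which illustrates why the abstract argues for optimizing design and controller together.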
A regressive machine-learning approach to the non-linear complex FAST model for hybrid floating offshore wind turbines with integrated oscillating water columns
Offshore wind energy is getting increasing attention as a clean alternative to the currently scarce fossil fuels mainly used in Europe's electricity supply. The further development and implementation of this kind of technology will help fight global warming, allowing a more sustainable and decarbonized power generation. In this sense, the integration of Floating Offshore Wind Turbines (FOWTs) with Oscillating Water Column (OWC) devices arises as a promising solution for hybrid renewable energy production. In these systems, OWC modules are employed not only for wave energy generation but also for FOWT stabilization and cost-efficiency. Nevertheless, analyzing and understanding the control performance of the aero-hydro-servo-elastic floating structure is an intricate and challenging task, even more so given the increase in dynamical complexity that incorporating OWCs within the FOWT platform involves. In this regard, although some time and frequency domain models have been developed, they are complex, computationally inefficient and suitable for neither real-time nor feedback control. In this context, this work presents a novel control-oriented regressive model for hybrid FOWT-OWC platforms. The main objective is to take advantage of the predictive and forecasting capabilities of deep-layered artificial neural networks (ANNs), together with their computational simplicity, to develop a feasible, lightweight, control-oriented model as an alternative to the aforementioned complex dynamical models. To achieve this objective, a deep-layered ANN model has been designed and trained to match the hybrid platform's structural performance. Then, the obtained scheme has been benchmarked against standard Multisurf-Wamit-FAST 5MW FOWT output data for different challenging scenarios in order to validate the model. 
The results demonstrate the adequate performance and accuracy of the proposed ANN control-oriented model, providing a great alternative for complex non-linear models traditionally used and allowing the implementation of advanced control schemes in a computationally convenient, straightforward, and easy way.
This work was supported in part by the Basque Government through project IT1555-22 and through the projects PID2021-123543OB-C21 and PID2021-123543OB-C22 (MCIN/AEI/10.13039/501100011033/FEDER, UE). The authors would also like to thank the UPV/EHU for the financial support through the María Zambrano grant MAZAM22/15 and Margarita Salas grant MARSA22/09 (UPV-EHU/MIU/Next Generation, EU) and through grant PIF20/299 (UPV/EHU).
Multi-criteria Evolution of Neural Network Topologies: Balancing Experience and Performance in Autonomous Systems
The majority of Artificial Neural Network (ANN) implementations in autonomous
systems use a fixed/user-prescribed network topology, leading to sub-optimal
performance and low portability. The existing NeuroEvolution of Augmenting
Topologies (NEAT) paradigm offers a powerful alternative by allowing the network
topology and the connection weights to be simultaneously optimized through an
evolutionary process. However, most NEAT implementations allow the
consideration of only a single objective. There also persists the question of
how to tractably introduce topological diversification that mitigates
overfitting to training scenarios. To address these gaps, this paper develops a
multi-objective neuro-evolution algorithm. While adopting the basic elements of
NEAT, important modifications are made to the selection, speciation, and
mutation processes. With the backdrop of small-robot path-planning
applications, an experience-gain criterion is derived to encapsulate the amount
of diverse local environment encountered by the system. This criterion
facilitates the evolution of genes that support exploration, thereby seeking to
generalize from a smaller set of mission scenarios than possible with
performance maximization alone. The effectiveness of the single-objective
(optimizing performance) and the multi-objective (optimizing performance and
experience-gain) neuro-evolution approaches is evaluated on two different
small-robot cases, with ANNs obtained by the multi-objective optimization
observed to provide superior performance in unseen scenarios.
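The abstract above does not state how the two objectives are combined during selection. One common option in multi-objective evolutionary algorithms is Pareto dominance, sketched below as an assumption and not as the paper's selection scheme. The candidate names and the (performance, experience-gain) scores are hypothetical.

```python
def dominates(a, b):
    """True if score tuple a is at least as good as b everywhere and better somewhere.

    Both objectives are assumed to be maximized.
    """
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))

def pareto_front(population):
    """Return the candidates that no other candidate dominates."""
    return [
        name for name, scores in population.items()
        if not any(dominates(other, scores)
                   for other_name, other in population.items()
                   if other_name != name)
    ]

# Hypothetical (performance, experience_gain) scores for four evolved networks.
population = {
    "net_a": (0.9, 0.2),
    "net_b": (0.6, 0.7),
    "net_c": (0.5, 0.5),
    "net_d": (0.3, 0.9),
}
survivors = pareto_front(population)
```

Here `net_c` is dropped because `net_b` is better on both objectives, while the other three each represent a different trade-off between performance and experience gain.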