199 research outputs found

    Improving the energy efficiency of autonomous underwater vehicles by learning to model disturbances

    Energy efficiency is one of the main challenges for the long-term autonomy of AUVs (Autonomous Underwater Vehicles). We propose a novel approach for improving the energy efficiency of AUV controllers, based on the ability to learn which external disturbances can safely be ignored. The proposed learning approach uses adaptive oscillators that are able to learn online the frequency, amplitude, and phase of zero-mean periodic external disturbances. Such disturbances occur naturally in open water due to waves, currents, and gravity, but can also be caused by the dynamics and hydrodynamics of the AUV itself. We formulate the theoretical basis of the approach and demonstrate its abilities on a number of input signals. Further experimental evaluation is conducted using a dynamic model of the Girona 500 AUV in simulation on two important underwater scenarios: hovering and trajectory tracking. The proposed approach shows significant energy-saving capabilities while maintaining high controller gains. The approach is generic and applicable not only to AUV control, but also to other types of control where periodic disturbances exist and can be accounted for by the controller. © 2013 IEEE
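    The core mechanism described above is an adaptive oscillator that locks onto a periodic input. As a rough illustration of the idea, here is a minimal sketch of a single such oscillator with online frequency, amplitude, and phase adaptation, assuming the commonly used Hebbian-style adaptation rules; the gains K and eta, the exact update equations, and the single-oscillator setup are illustrative assumptions, not the paper's formulation.

```python
import numpy as np

class AdaptiveOscillator:
    """Sketch of an adaptive frequency oscillator tracking a zero-mean
    periodic signal; gains and update rules are illustrative assumptions."""

    def __init__(self, omega0=1.0, K=2.0, eta=0.5, dt=1e-3):
        self.phi = 0.0       # phase estimate
        self.omega = omega0  # angular-frequency estimate (rad/s)
        self.amp = 0.0       # amplitude estimate
        self.K, self.eta, self.dt = K, eta, dt

    def step(self, u):
        """Adapt to one sample u of the disturbance and return the
        oscillator's current reconstruction of it."""
        c, s = np.cos(self.phi), np.sin(self.phi)
        e = u - self.amp * c                            # reconstruction error
        # Phase and frequency phase-lock onto the input signal.
        self.phi += (self.omega - self.K * e * s) * self.dt
        self.omega += -self.K * e * s * self.dt
        # Amplitude follows the error projected onto the oscillator output.
        self.amp += self.eta * e * c * self.dt
        return self.amp * np.cos(self.phi)

# Usage: feed a synthetic 3 rad/s disturbance; omega should converge near 3.
osc = AdaptiveOscillator()
for t in np.arange(0.0, 50.0, osc.dt):
    osc.step(1.5 * np.sin(3.0 * t))
print(osc.omega, osc.amp)
```

    Once converged, the oscillator's output is a prediction of the periodic disturbance, which the controller can then treat as something to ignore rather than to compensate for.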

    Encoderless position control of a two-link robot manipulator


    Eligibility Propagation to Speed up Time Hopping for Reinforcement Learning

    A mechanism called Eligibility Propagation is proposed to speed up the Time Hopping technique used for faster Reinforcement Learning in simulations. Eligibility Propagation provides Time Hopping with abilities similar to those that eligibility traces provide for conventional Reinforcement Learning. It propagates values from one state to all of its temporal predecessors using a state transition graph. Experiments on a simulated biped crawling robot confirm that Eligibility Propagation accelerates the learning process more than 3 times.
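    The propagation step itself is straightforward to sketch. Below is a minimal illustration of pushing a value update backward through a recorded predecessor graph with geometric decay, assuming a tabular value function; the function names, the decay factors gamma and lam, and the cutoff tol are illustrative assumptions, not the paper's implementation.

```python
from collections import defaultdict

def propagate(V, predecessors, state, delta, gamma=0.99, lam=0.9, tol=1e-4):
    """Push a value update `delta` at `state` to all of its temporal
    predecessors, decaying the credit at each backward step."""
    frontier = [(state, delta)]
    while frontier:
        s, d = frontier.pop()
        d_prev = lam * gamma * d        # decayed credit for predecessors
        if abs(d_prev) < tol:
            continue                    # geometric decay bounds the recursion
        for p in predecessors[s]:
            V[p] += d_prev
            frontier.append((p, d_prev))

# Usage on a toy chain s0 -> s1 -> s2: an update at s2 reaches s0.
V = defaultdict(float)
predecessors = {"s2": ["s1"], "s1": ["s0"], "s0": []}
propagate(V, predecessors, "s2", delta=1.0)
print(dict(V))  # s1 gets ~0.891, s0 gets ~0.794
```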

    Simultaneous discovery of multiple alternative optimal policies by reinforcement learning

    Conventional reinforcement learning algorithms for direct policy search are limited to finding only a single optimal policy. This is caused by their local-search nature, which allows them to converge only to a single local optimum in policy space and makes them heavily dependent on the policy initialization. In this paper, we propose a novel reinforcement learning algorithm for direct policy search, which is capable of simultaneously finding multiple alternative optimal policies. The algorithm is based on particle filtering and performs global search in policy space, thereby eliminating the dependency on the policy initialization and giving it the ability to find the globally optimal policy. We validate the approach on one- and two-dimensional problems with multiple optima, and compare its performance to a global random sampling method and a state-of-the-art Expectation-Maximization-based reinforcement learning algorithm. © 2012 IEEE

    Towards improved AUV control through learning of periodic signals

    Designing a high-performance controller for an Autonomous Underwater Vehicle (AUV) is a challenging task. There are often numerous requirements, sometimes contradictory, such as speed, precision, robustness, and energy efficiency. In this paper, we propose a theoretical concept for improving the performance of AUV controllers based on the ability to learn periodic signals. The proposed learning approach is based on adaptive oscillators that are able to learn online the frequency, amplitude, and phase of zero-mean periodic signals. Such signals occur naturally in open water due to waves, currents, and gravity, but can also be caused by the dynamics and hydrodynamics of the AUV itself. We formulate the theoretical basis of the approach and demonstrate its abilities on synthetic input signals. Further evaluation is conducted in simulation with a dynamic model of the Girona 500 AUV on a hovering task.

    Direct policy search reinforcement learning based on particle filtering

    We reveal a link between particle filtering methods and direct policy search reinforcement learning, and propose a novel reinforcement learning algorithm based heavily on ideas borrowed from particle filters. A major advantage of the proposed algorithm is its ability to perform global search in policy space and thus find the globally optimal policy. We validate the approach on one- and two-dimensional problems with multiple optima, and compare its performance to a global random sampling method and a state-of-the-art Expectation-Maximization-based reinforcement learning algorithm.
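    As a rough sketch of how particle filtering ideas can map onto direct policy search, the following treats each policy parameter vector as a particle, weights particles by their returns, then resamples and perturbs them; the softmax weighting, noise scale, and uniform initialization are illustrative assumptions rather than the authors' exact algorithm.

```python
import numpy as np

def particle_policy_search(evaluate, dim, n_particles=100, n_iters=50,
                           noise=0.3, temperature=1.0, seed=0):
    """Global direct policy search with a particle-filter flavor:
    evaluate -> weight -> resample -> perturb. `evaluate(theta)` must
    return the (possibly noisy) return of the policy with parameters theta."""
    rng = np.random.default_rng(seed)
    particles = rng.uniform(-5.0, 5.0, size=(n_particles, dim))  # global init
    for _ in range(n_iters):
        returns = np.array([evaluate(p) for p in particles])
        # Softmax importance weights: higher-return particles get more mass.
        w = np.exp((returns - returns.max()) / temperature)
        w /= w.sum()
        # Resample in proportion to weight, then perturb to keep exploring,
        # so several separated optima can each retain a particle cluster.
        idx = rng.choice(n_particles, size=n_particles, p=w)
        particles = particles[idx] + rng.normal(0.0, noise, (n_particles, dim))
    returns = np.array([evaluate(p) for p in particles])
    return particles[np.argmax(returns)]

# Usage on a 1-D multimodal "return" with its global optimum near theta = 2.
best = particle_policy_search(lambda th: -(th[0] - 2.0) ** 2 + np.sin(5.0 * th[0]),
                              dim=1)
print(best)
```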

    Exploring Restart Distributions

    We consider the generic approach of using an experience memory to help exploration by adapting a restart distribution. That is, given the capacity to reset the state to one corresponding to the agent's past observations, we aid exploration by promoting faster state-space coverage, restarting the agent from a more diverse set of initial states as well as from states associated with significant past experiences. This approach is compatible with both on-policy and off-policy methods. However, a caveat is that altering the distribution of initial states could change the optimal policies when searching within a restricted class of policies. To reduce this unsought learning bias, we evaluate our approach in deep reinforcement learning, which benefits from the high representational capacity of deep neural networks. We instantiate three variants of our approach, each inspired by an idea from the context of experience replay. Using these variants, we show that performance gains can be achieved, especially in hard exploration problems.
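    A minimal sketch of the restart mechanism, assuming the simulator can be reset to an arbitrary stored state: here env.reset_to(...) is a hypothetical API, and the uniform sampling from memory stands in for the paper's three more informed variants.

```python
import random

class RestartMemory:
    """Sketch of a restart distribution built from past observations;
    capacity, mixing probability, and uniform sampling are assumptions."""

    def __init__(self, capacity=10_000, p_restart=0.5):
        self.states = []
        self.capacity = capacity
        self.p_restart = p_restart  # chance of restarting from memory

    def add(self, state):
        """Record a visited state, evicting a random old one when full."""
        if len(self.states) >= self.capacity:
            self.states.pop(random.randrange(len(self.states)))
        self.states.append(state)

    def reset(self, env):
        """Restart from a stored past state with probability p_restart,
        otherwise from the environment's default initial distribution."""
        if self.states and random.random() < self.p_restart:
            return env.reset_to(random.choice(self.states))  # hypothetical API
        return env.reset()
```

    Mixing in the default initial distribution keeps some probability mass on the task's true start states, which is one simple way to limit the learning bias the abstract warns about.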