    Feedback control by online learning an inverse model

    A model, predictor, or error estimator is often used by a feedback controller to control a plant. Creating such a model is difficult when the plant exhibits nonlinear behavior. In this paper, a novel online learning control framework is proposed that does not require explicit knowledge of the plant. This framework uses two learning modules, one for creating an inverse model, and the other for actually controlling the plant. Except for their inputs, they are identical. The inverse model learns from the exploration performed by the not yet fully trained controller, while the actual controller is based on the currently learned model. The proposed framework allows fast online learning of an accurate controller. The controller can be applied to a broad range of tasks with different dynamic characteristics. We validate this claim by applying our control framework to several control tasks: 1) the heating tank problem (slow nonlinear dynamics); 2) flight pitch control (slow linear dynamics); and 3) the balancing problem of a double inverted pendulum (fast linear and nonlinear dynamics). The results of these experiments show that fast learning and accurate control can be achieved. Furthermore, a comparison is made with some classical control approaches, and observations concerning convergence and stability are made.
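The two-module idea in this abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the scalar linear plant `y_next = a*y + b*u`, the linear inverse model, the normalized LMS update, and the square-wave setpoint. The point is only that the controller and the inverse model share the same weights and differ in their inputs (desired next output versus observed next output).

```python
import random

def train_inverse_controller(steps=2000, seed=0):
    """Learn u = w[0]*y_next + w[1]*y online and reuse it as the controller."""
    rng = random.Random(seed)
    a, b = 0.9, 0.5        # hypothetical plant parameters, unknown to the learner
    w = [0.0, 0.0]         # weights shared by the inverse model and the controller
    lr, eps = 0.5, 1e-8    # normalized LMS step size and a small regularizer
    y = 0.0
    for t in range(steps):
        y_ref = 1.0 if (t // 50) % 2 == 0 else -1.0  # square-wave setpoint
        # Controller: shared weights, fed the *desired* next output,
        # plus Gaussian exploration noise.
        u = w[0] * y_ref + w[1] * y + rng.gauss(0.0, 0.2)
        y_next = a * y + b * u                       # plant response
        # Inverse model: same weights, fed the *observed* next output,
        # trained to reproduce the control that was actually applied.
        x = (y_next, y)
        err = u - (w[0] * x[0] + w[1] * x[1])
        norm = eps + x[0] * x[0] + x[1] * x[1]
        w[0] += lr * err * x[0] / norm
        w[1] += lr * err * x[1] / norm
        y = y_next
    return w

if __name__ == "__main__":
    print(train_inverse_controller())
```

For this assumed plant the exact inverse is `u = (1/b)*y_next - (a/b)*y`, so the learned weights should approach `1/b` and `-a/b`.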

    Comparison of Modern Controls and Reinforcement Learning for Robust Control of Autonomously Backing Up Tractor-Trailers to Loading Docks

    The performance of two controllers is assessed for generalization in the path-following task of autonomously backing up a tractor-trailer. Starting from random locations and orientations, paths are generated to loading docks with arbitrary pose using Dubins curves. The combination vehicles can be varied in wheelbase, hitch length, weight distribution, and tire cornering stiffness. The closed-form calculation of the gains for the Linear Quadratic Regulator (LQR) relies heavily on having an accurate model of the plant. However, real-world applications cannot expect to have an updated model for each new trailer. Finding alternative robust controllers when the trailer model is changed was the motivation of this research. Reinforcement learning, with neural networks as its function approximators, can allow for generalized control from its learned experience, which is characterized by a scalar reward value. The Linear Quadratic Regulator and the Deep Deterministic Policy Gradient (DDPG) are compared for robust control when the trailer is changed. This investigation quantifies the capabilities and limitations of both controllers in simulation using a kinematic model. The controllers are evaluated for generalization by altering the kinematic model trailer wheelbase, hitch length, and velocity from the nominal case. In order to close the gap between simulation and reality, the control methods are also assessed with sensor noise and various controller frequencies. The root mean squared and maximum errors from the path are used as metrics, along with the number of times the controllers cause the vehicle to jackknife or reach the goal. Considering the runs where the LQR did not cause the trailer to jackknife, the LQR tended to have slightly better precision. DDPG, however, controlled the trailer successfully on the paths where the LQR jackknifed. 
Reinforcement learning was found to sacrifice a short-term reward, such as precision, to maximize the future expected reward, such as reaching the loading dock. The reinforcement learning agent learned a policy that imposed nonlinear constraints such that it never jackknifed, even when it wasn't the trailer it trained on.
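The abstract's remark that the LQR gains depend on an accurate plant model can be seen in a small sketch. This is not the tractor-trailer model from the work above: it assumes a hypothetical scalar discrete-time plant `x_next = a*x + b*u` with cost weights `q` and `r`, and obtains the gain by iterating the scalar Riccati recursion. The function names are illustrative.

```python
def scalar_dlqr_gain(a, b, q, r, iters=500):
    """Gain k for u = -k*x on the assumed scalar plant x_next = a*x + b*u."""
    p = q
    for _ in range(iters):
        # Scalar discrete-time Riccati recursion, iterated to a fixed point.
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
    return (a * b * p) / (r + b * b * p)

def closed_loop_pole(a_true, b_true, k):
    """Pole of the true plant when driven by a gain designed on another model."""
    return a_true - b_true * k

if __name__ == "__main__":
    k = scalar_dlqr_gain(a=1.0, b=1.0, q=1.0, r=1.0)  # gain from the nominal model
    print("gain:", k)
    print("pole on nominal plant:", closed_loop_pole(1.0, 1.0, k))
    print("pole on changed plant:", closed_loop_pole(1.8, 1.0, k))
```

The gain is computed entirely from `a` and `b`, so when the true plant differs from the nominal one the closed-loop pole moves and, as in the last call, can leave the unit circle.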

    Evolutionary control of autonomous underwater vehicles

    The goal of Evolutionary Robotics (ER) is the development of automatic processes for the synthesis of robot control systems using evolutionary computation. The idea that it may be possible to synthesise robotic control systems using an automatic design process is appealing. However, ER is considerably more challenging and less automatic than its advocates would suggest. ER applies methods from the field of neuroevolution to evolve robot control systems. Neuroevolution is a machine learning algorithm that applies evolutionary computation to the design of Artificial Neural Networks (ANN). The aim of this thesis is to assay the practical characteristics of neuroevolution by performing bulk experiments on a set of Reinforcement Learning (RL) problems. This thesis was conducted with the view of applying neuroevolution to the design of neurocontrollers for small low-cost Autonomous Underwater Vehicles (AUV). A general approach to neuroevolution for RL problems is presented. The is selected to evolve ANN connection weights on the basis that it has shown competitive performance on continuous optimisation problems, is self-adaptive and can exploit dependencies between connection weights. Practical implementation issues are identified and discussed. A series of experiments are conducted on RL problems. These problems are representative of problems from the AUV domain, but manageable in terms of problem complexity and computational resources required. Results from these experiments are analysed to draw out practical characteristics of neuroevolution. Bulk experiments are conducted using the inverted pendulum problem. This popular control benchmark is inherently unstable, underactuated and non-linear: characteristics common to underwater vehicles. Two practical characteristics of neuroevolution are demonstrated: the importance of using randomly generated evaluation sets and the effect of evaluation noise on search performance. 
As part of these experiments, deficiencies in the benchmark are identified and modifications suggested. The problem of an underwater vehicle travelling to a goal in an obstacle-free environment is studied. The vehicle is modelled as a Dubins car, which is a simplified model of the high-level kinematics of a torpedo-class underwater vehicle. Two practical characteristics of neuroevolution are demonstrated: the importance of domain knowledge when formulating ANN inputs and how the fitness function defines the set of evolvable control policies. Paths generated by the evolved neurocontrollers are compared with known optimal solutions. A framework is presented to guide the practical application of neuroevolution to RL problems that covers a range of issues identified during the experiments conducted in this thesis. An assessment of neuroevolution concludes that it is far from automatic yet still has potential as a technique for solving reinforcement learning problems, although further research is required to better understand the process of evolutionary learning. The major contribution made by this thesis is a rigorous empirical study of the practical characteristics of neuroevolution as applied to RL problems. A critical, yet constructive, viewpoint is taken of neuroevolution. This viewpoint differs from much of the research undertaken in this field, which is often unjustifiably optimistic and tends to gloss over difficult practical issues.
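For readers unfamiliar with the Dubins car mentioned above, the standard kinematics are constant forward speed with a bounded turn rate. The sketch below integrates them with a simple Euler step. The speed, time step, and turn-rate bound are assumed values for illustration, not parameters from the thesis, and `dubins_step` and `rollout` are hypothetical helper names.

```python
import math

def dubins_step(x, y, theta, turn_rate, v=1.0, dt=0.1, max_turn_rate=1.0):
    """One Euler step of the Dubins car: fixed speed v, bounded turn rate."""
    turn_rate = max(-max_turn_rate, min(max_turn_rate, turn_rate))
    x += v * math.cos(theta) * dt
    y += v * math.sin(theta) * dt
    theta += turn_rate * dt
    # Wrap the heading into (-pi, pi].
    theta = math.atan2(math.sin(theta), math.cos(theta))
    return x, y, theta

def rollout(turn_rates, state=(0.0, 0.0, 0.0)):
    """Apply a sequence of turn-rate commands and return the final state."""
    for u in turn_rates:
        state = dubins_step(*state, u)
    return state

if __name__ == "__main__":
    print(rollout([0.0] * 10))   # straight-line motion along the x-axis
    print(rollout([1.0] * 10))   # constant left turn at the rate limit
```

A neurocontroller in this setting would supply the `turn_rate` command at each step; here the commands are fixed sequences.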
