127 research outputs found

    Adaptive dynamic programming with eligibility traces and complexity reduction of high-dimensional systems

    This dissertation investigates the application of a variety of computational intelligence techniques, particularly clustering and adaptive dynamic programming (ADP) designs, especially heuristic dynamic programming (HDP) and dual heuristic programming (DHP). Moreover, one-step temporal-difference learning (TD(0)) and n-step TD (TD(λ)), together with their gradients, are used as learning algorithms to train and online-adapt the families of ADP. The dissertation is organized into seven papers. The first paper demonstrates the robustness of model order reduction (MOR) for simulating complex dynamical systems. Agglomerative hierarchical clustering based on performance evaluation is introduced for MOR. This method computes the reduced-order denominator of the transfer function by clustering system poles in a hierarchical dendrogram. Several numerical examples of reduction techniques are taken from the literature for comparison with our work. In the second paper, HDP is combined with the Dyna algorithm for path planning. The third paper uses DHP with an eligibility trace parameter (λ) to track a reference trajectory under uncertainties for a nonholonomic mobile robot, using a first-order Sugeno fuzzy neural network structure for the critic and actor networks. In the fourth and fifth papers, a stability analysis for a model-free action-dependent HDP(λ) is demonstrated with batch and online implementations of learning, respectively. The sixth work combines two different gradient prediction levels of critic networks, and we provide convergence proofs. The seventh paper develops two hybrid recurrent fuzzy neural network structures for both critic and actor networks. They use a novel n-step gradient temporal-difference method (the gradient of TD(λ)) of an advanced ADP algorithm called value-gradient learning (VGL(λ)), and convergence proofs are given. Furthermore, the seventh paper is the first to combine the single network adaptive critic with VGL(λ). --Abstract, page iv
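As context for the learning rule named in the title, here is a minimal sketch of tabular TD(λ) with accumulating eligibility traces; the environment interface and hyperparameter values are illustrative assumptions, not taken from the dissertation.

```python
import numpy as np

def td_lambda(env, n_states, episodes=500, alpha=0.1, gamma=0.99, lam=0.9):
    """Tabular TD(lambda) value prediction with accumulating eligibility traces."""
    V = np.zeros(n_states)              # value estimates
    for _ in range(episodes):
        e = np.zeros(n_states)          # eligibility traces, reset each episode
        s = env.reset()
        done = False
        while not done:
            # assumed interface: advance one step under a fixed policy
            s_next, r, done = env.step()
            delta = r + gamma * V[s_next] * (not done) - V[s]  # TD error
            e[s] += 1.0                 # accumulate trace for the visited state
            V += alpha * delta * e      # credit all recently visited states
            e *= gamma * lam            # decay traces toward TD(0) as lam -> 0
            s = s_next
    return V
```

Setting λ = 0 recovers one-step TD(0), while λ → 1 approaches Monte Carlo returns, which is what makes the trace parameter a useful knob for the ADP designs above.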

    Learning Control of Robotic Arm Using Deep Q-Neural Network

    Enabling robotic systems to act autonomously, as in driverless systems, is a very complex task in real-world scenarios due to uncertainties. Machine learning capabilities have been quickly making their way into autonomous systems and industrial robotics technology, finding applications in many sectors, including autonomous vehicles, humanoid robots, drones, and more. In this research we implement artificial intelligence in a robotic arm so that it can solve a complex balancing control problem from scratch, without any explicit feedback loop, using the state-of-the-art deep reinforcement learning algorithm DQN. The benchmark problem considered as a case study is balancing an inverted pendulum upright using a six-degree-of-freedom robot arm. A very simple form of this problem has recently been solved using machine learning; in this thesis, however, we build a much more complex inverted-pendulum system and implement it in the Robot Operating System (ROS), a highly realistic simulation environment. We not only succeed in controlling the pendulum but also apply turbulence to the learned model to study its robustness. We observe how the initially learned model is unstable in the presence of turbulence and how random turbulence helps the system evolve into a more robust model. We also use the robust model in a different environment and show how it adapts to the new physical properties. Using an orientation sensor on the tip of the inverted pendulum to obtain angular velocity, simulating in ROS, and mounting the inverted pendulum on a ball joint are a few of the highlighted novelties of this thesis compared with previous publications.
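To make the named algorithm concrete, below is a minimal sketch of one DQN gradient step with a frozen target network, written in Python/PyTorch; the networks, optimizer, and batch layout are illustrative assumptions, not the thesis's actual implementation.

```python
import torch
import torch.nn.functional as F

def dqn_update(q_net, target_net, optimizer, batch, gamma=0.99):
    """One DQN gradient step on a replay-buffer batch (illustrative sketch)."""
    # assumed batch layout: float states, long actions, float rewards/dones
    states, actions, rewards, next_states, dones = batch
    # Q(s, a) for the actions actually taken
    q_sa = q_net(states).gather(1, actions.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        # Bellman target uses a frozen target network for stability
        max_next_q = target_net(next_states).max(dim=1).values
        target = rewards + gamma * (1.0 - dones) * max_next_q
    loss = F.mse_loss(q_sa, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

The target network is typically synchronized with the online network only every few thousand steps, which decorrelates the regression target from the parameters being updated.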

    Modeling and Control Strategies for a Two-Wheel Balancing Mobile Robot

    The problem of balancing and autonomously navigating a two-wheel mobile robot is an increasingly active area of research, due to its potential applications in last-mile delivery, pedestrian transportation, warehouse automation, parts supply, agriculture, surveillance, and monitoring. This thesis investigates the design and control of a two-wheel balancing mobile robot using three different control strategies: Proportional Integral Derivative (PID) control, Sliding Mode Control, and Deep Q-Learning. The mobile robot is described by a dynamic and kinematic model, and its motion is simulated in a custom MATLAB/Simulink environment. The first part of the thesis focuses on developing the dynamic and kinematic model. The robot dynamics are derived using the classical Euler-Lagrange method, in which motion is described using the potential and kinetic energies of the bodies. Non-holonomic constraints are included in the model, via the method of Lagrange multipliers, to achieve the desired motion, such as preventing the mobile robot from drifting. Navigation is developed using artificial potential field path planning to generate a map of velocity vectors that serve as set points for linear velocity and yaw rate. The second part of the thesis focuses on developing and evaluating the three control strategies: PID controllers, Hierarchical Sliding Mode Control, and Deep Q-Learning. Their performance is evaluated and compared on metrics such as stability, robustness to mass variations and disturbances, and tracking accuracy. The implementation and evaluation of these strategies are modeled and tested in a MATLAB/Simulink virtual environment.
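As a concrete reference for the first of the three strategies, here is a minimal discrete-time PID controller sketch in Python (the thesis itself works in MATLAB/Simulink); the gains and the tilt-angle usage example are illustrative assumptions.

```python
class PID:
    """Minimal discrete PID controller (illustrative; gains are assumptions)."""
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, setpoint, measurement):
        error = setpoint - measurement
        self.integral += error * self.dt                  # integral term
        derivative = (error - self.prev_error) / self.dt  # derivative term
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

# Hypothetical tilt loop at 100 Hz: drive the pitch angle to zero
pid = PID(kp=20.0, ki=1.5, kd=0.8, dt=0.01)
u = pid.update(setpoint=0.0, measurement=0.05)  # wheel torque command
```

A balancing robot usually runs two such loops in cascade: an outer loop turning velocity error into a pitch set point, and an inner loop like the one above stabilizing pitch.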

    Modeling and control of an inverted-pendulum robot using reinforcement learning

    Undergraduate final project (TCC) - Universidade Federal de Santa Catarina, Centro Tecnológico, Engenharia Eletrônica. This work presents experimental virtual-simulation research with a robotic model of the inverted pendulum on two parallel wheels. Balancing this system is a classic task in the control field. Due to properties such as nonlinearity and instability, many classical control algorithms are difficult to apply to this system. In this context, an alternative approach is proposed, with controllers based on artificial intelligence. The robot was modeled in state space based on Newton's laws and electrical circuits, including parameters such as moments of inertia, motor inductance, friction forces, and the effects of mechanical reduction. To make the model more realistic, the motor parameters were determined from a real motor. The objective was to control both the inclination and the position of the robot, defined relative to the plane on which it moves. To this end, two reinforcement learning algorithms were applied: Double Deep Q-Network, which uses artificial neural networks to treat the state space continuously, and Twin Delayed Deep Deterministic Policy Gradient, which works with states and actions in a continuous domain. In this way, the control task was solved through trial-and-error learning, without the need for prior knowledge of the system. The performance of the resulting controllers, as well as the efficiency of their training, was evaluated with multiple simulations. In addition, the effect of noise on sensor measurements used in practical applications was verified, as was the application of an input signal to control the robot's position. Finally, the effects of changes to the function that characterizes the objective of the controllers were evaluated. It was concluded that the applied algorithms have the potential to outperform classical controllers in several respects; however, these methods also present implementation difficulties.
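Of the two algorithms applied here, Double DQN's key refinement over plain DQN is decoupling action selection from action evaluation; below is a minimal sketch of the Double DQN target computation in Python/PyTorch, with assumed tensor shapes.

```python
import torch

def double_dqn_target(q_net, target_net, rewards, next_states, dones, gamma=0.99):
    """Double DQN target: the online network selects the next action,
    the target network evaluates it, reducing overestimation bias."""
    with torch.no_grad():
        # selection by the online network ...
        best_actions = q_net(next_states).argmax(dim=1, keepdim=True)
        # ... evaluation by the target network
        next_q = target_net(next_states).gather(1, best_actions).squeeze(1)
        return rewards + gamma * (1.0 - dones) * next_q
```

Plain DQN would instead take `target_net(next_states).max(dim=1).values`, letting a single network both pick and score the action, which systematically overestimates values under noise.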

    Coherent Transport of Quantum States by Deep Reinforcement Learning

    Some problems in physics can be handled only after a suitable ansatz solution has been guessed. Such a method is therefore resistant to generalization and of limited scope. The coherent transport by adiabatic passage of a quantum state through an array of semiconductor quantum dots provides a par excellence example of this approach, where it is necessary to introduce the so-called counter-intuitive control gate ansatz pulse sequence. By contrast, deep reinforcement learning has proven able to solve very complex sequential decision-making problems involving competition between short-term and long-term rewards, despite a lack of prior knowledge. We show that, in the above problem, deep reinforcement learning discovers control sequences outperforming the counter-intuitive ansatz sequence. Even more interestingly, it discovers novel strategies when realistic disturbances affect the ideal system, achieving better speed and fidelity when energy detuning between the ground states of the quantum dots or dephasing are added to the master equation, and also mitigating the effects of losses. This method enables online updating of realistic systems, as policy convergence is boosted by exploiting prior knowledge when available. Deep reinforcement learning proves effective at controlling the dynamics of quantum states and, more generally, applies whenever an ansatz solution is unknown or insufficient to treat the problem effectively.
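For orientation, the standard three-dot model behind the counter-intuitive sequence (as given in the coherent-transport-by-adiabatic-passage literature, not quoted from this paper) couples the dot ground states |1⟩, |2⟩, |3⟩ through tunable tunneling rates Ω12(t) and Ω23(t):

```latex
% Three-dot CTAP Hamiltonian in the {|1>, |2>, |3>} ground-state basis,
% with its zero-energy dark state
H(t) = -\frac{\hbar}{2}
\begin{pmatrix}
0 & \Omega_{12}(t) & 0 \\
\Omega_{12}(t) & 0 & \Omega_{23}(t) \\
0 & \Omega_{23}(t) & 0
\end{pmatrix},
\qquad
\lvert D(t)\rangle \propto \Omega_{23}(t)\,\lvert 1\rangle - \Omega_{12}(t)\,\lvert 3\rangle .
```

Pulsing Ω23 before Ω12 (the counter-intuitive order) keeps the system in the dark state |D(t)⟩, transferring population from dot 1 to dot 3 while leaving the middle dot nominally unoccupied; the reinforcement learning agent's task is to find pulse shapes that accomplish this transfer faster and more robustly than the ansatz.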

    Robust and versatile humanoid locomotion based on analytical control and residual physics

    Humanoid robots are made to resemble humans, but their locomotion abilities are far from ours in terms of agility and versatility. When humans walk on complex terrain or face external disturbances, they combine a set of strategies, unconsciously and efficiently, to regain stability. This thesis tackles the problem of developing a robust omnidirectional walking framework that is able to generate versatile and agile locomotion on complex terrain. We designed and developed model-based and model-free walk engines, formulated the controllers using different approaches, including classical and optimal control schemes, and validated their performance through simulations and experiments. These frameworks have hierarchical structures composed of several layers, each built from modules connected together to reduce complexity and increase the flexibility of the proposed frameworks. Additionally, they can be easily and quickly deployed on different platforms. Moreover, we believe that using machine learning on top of analytical approaches is key to opening the door for humanoid robots to step out of the laboratory. We proposed a tight coupling between analytical control and deep reinforcement learning: we augmented our analytical controller with reinforcement learning modules that learn to regulate the walk engine parameters (planners and controllers) adaptively and to generate residuals that adjust the robot's target joint positions (residual physics). The effectiveness of the proposed frameworks was demonstrated and evaluated across a set of challenging simulation scenarios. The robot was able to generalize what it learned in one scenario, displaying human-like locomotion skills in unforeseen circumstances, even in the presence of noise and external pushes. Programa Doutoral em Informática.
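The residual-physics idea described above reduces, at each control step, to summing the analytical walk engine's joint targets with a small learned correction. A minimal Python sketch under assumed interfaces follows (the function names and scale factor are illustrative, not the thesis's actual API):

```python
import numpy as np

def hybrid_step(state, analytic_walk_engine, rl_policy, residual_scale=0.05):
    """Residual-physics control: an analytical walk engine proposes target
    joint positions; a learned policy adds small corrective residuals.
    Illustrative sketch; interfaces and the scale factor are assumptions."""
    base_targets = analytic_walk_engine(state)   # model-based joint targets
    residual = rl_policy(state)                  # learned correction, e.g. in [-1, 1]
    return base_targets + residual_scale * np.asarray(residual)
```

Bounding the residual keeps the analytical controller responsible for the bulk of the motion, so the policy only has to learn corrections rather than walking from scratch.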

    An upper limb Functional Electrical Stimulation controller based on Reinforcement Learning: A feasibility case study.

    Controllers for Functional Electrical Stimulation (FES) are still unable to restore natural movements in the paretic arm. In this work, Reinforcement Learning (RL) is used for the first time to control a hybrid upper limb robotic system for stroke rehabilitation in a real environment. The feasibility of the FES controller is tested on one healthy subject during elbow flexion-extension in the horizontal plane. Results showed an absolute position error of less than 1.2° for a maximum range of motion of 50°.