127 research outputs found
Adaptive dynamic programming with eligibility traces and complexity reduction of high-dimensional systems
This dissertation investigates the application of a variety of computational intelligence techniques, particularly clustering and adaptive dynamic programming (ADP) designs, especially heuristic dynamic programming (HDP) and dual heuristic programming (DHP). Moreover, one-step temporal-difference (TD(0)) and n-step TD (TD(λ)) learning, along with their gradients, are used to train and online-adapt the ADP families. The dissertation is organized into seven papers. The first paper demonstrates the robustness of model order reduction (MOR) for simulating complex dynamical systems. Agglomerative hierarchical clustering based on performance evaluation is introduced for MOR. This method computes the reduced-order denominator of the transfer function by clustering system poles in a hierarchical dendrogram. Several numerical reduction techniques are taken from the literature for comparison with our work. In the second paper, HDP is combined with the Dyna algorithm for path planning. The third paper uses DHP with an eligibility trace parameter (λ) to track a reference trajectory under uncertainties for a nonholonomic mobile robot, using a first-order Sugeno fuzzy neural network structure for the critic and actor networks. In the fourth and fifth papers, a stability analysis for a model-free action-dependent HDP(λ) is demonstrated with batch and online learning implementations, respectively. The sixth work combines two different gradient prediction levels of critic networks and provides convergence proofs. The seventh paper develops two hybrid recurrent fuzzy neural network structures for both critic and actor networks. They use a novel n-step gradient temporal-difference (the gradient of TD(λ)) within an advanced ADP algorithm called value-gradient learning (VGL(λ)), and convergence proofs are given. Furthermore, the seventh paper is the first to combine the single network adaptive critic with VGL(λ). --Abstract, page iv
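The eligibility-trace mechanism behind TD(λ), which the ADP designs above rely on, can be sketched in a minimal tabular form. This is an illustrative sketch only: the three-state chain, step size, and trace parameters below are assumptions for demonstration, not the dissertation's actual systems.

```python
import numpy as np

def td_lambda_update(V, trajectory, alpha=0.1, gamma=0.9, lam=0.8):
    """One episode of tabular TD(lambda) with accumulating eligibility traces."""
    e = np.zeros_like(V)                      # eligibility trace per state
    for s, r, s_next, done in trajectory:
        target = r + (0.0 if done else gamma * V[s_next])
        delta = target - V[s]                 # one-step TD error
        e[s] += 1.0                           # accumulate trace for visited state
        V += alpha * delta * e                # credit all recently visited states
        e *= gamma * lam                      # decay all traces
    return V

# hypothetical 3-state chain: 0 -> 1 -> 2 (terminal, reward 1 on the last step)
V = np.zeros(3)
episode = [(0, 0.0, 1, False), (1, 1.0, 2, True)]
V = td_lambda_update(V, episode)
```

With λ = 0, only the most recent state would receive credit; the decaying trace is what propagates the terminal reward back to earlier states within a single episode.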
Learning Control of Robotic Arm Using Deep Q-Neural Network
Enabling robotic systems to act autonomously, as in driverless systems, is a very complex task in real-world scenarios due to uncertainties. Machine learning capabilities have been quickly making their way into autonomous systems and industrial robotics technology, finding applications in every sector, including autonomous vehicles, humanoid robots, drones, and many more. In this research we implement artificial intelligence in a robotic arm to solve a complex balancing control problem from scratch, without any feedback loop, using a state-of-the-art deep reinforcement learning algorithm named DQN. The benchmark problem considered as a case study is balancing an inverted pendulum upright using a six-degree-of-freedom robot arm. A very simple form of this problem has been solved recently using machine learning; in this thesis, however, we built a much more complex inverted pendulum system and implemented it in the Robot Operating System (ROS), a very realistic simulation environment. We not only succeeded in controlling the pendulum but also added disturbances to the learned model to study its robustness. We observed how the initially learned model is unstable in the presence of disturbances and how random disturbances help the system evolve into a more robust model. We also used the robust model in a different environment and showed how the model adapts itself to the new physical properties. Using an orientation sensor on the tip of the inverted pendulum to obtain angular velocity, simulating in ROS, and mounting the inverted pendulum on a ball joint are a few of the novelties of this thesis compared to previous publications.
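As an illustration of the DQN-style temporal-difference update the thesis builds on, here is a minimal sketch with a linear Q-function and a frozen target network. The state dimension, seed, and learning rate are hypothetical; a real DQN additionally uses a deep network, a replay buffer, and ε-greedy exploration.

```python
import numpy as np

rng = np.random.default_rng(0)

# hypothetical tiny setting: 4-dim state, 2 actions, linear Q(s, a) = s @ W[:, a]
W_online = rng.normal(size=(4, 2))
W_target = W_online.copy()                    # frozen target network (the DQN trick)

def q_values(W, s):
    return s @ W

def dqn_td_target(r, s_next, done, gamma=0.99):
    # bootstrap from the *target* network's greedy value, as in DQN
    return r if done else r + gamma * q_values(W_target, s_next).max()

def dqn_update(s, a, r, s_next, done, lr=0.01):
    """One semi-gradient step of the online network toward the DQN target."""
    td_error = dqn_td_target(r, s_next, done) - q_values(W_online, s)[a]
    W_online[:, a] += lr * td_error * s       # gradient of linear Q w.r.t. W[:, a]
    return td_error

s, s_next = np.ones(4), np.zeros(4)
err_before = abs(dqn_update(s, 0, 1.0, s_next, done=True))
err_after = abs(dqn_update(s, 0, 1.0, s_next, done=True))
```

Repeating the update on the same transition shrinks the TD error, since the online network moves toward the fixed target while the target weights stay frozen between synchronizations.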
Modeling and Control Strategies for a Two-Wheel Balancing Mobile Robot
The problem of balancing and autonomously navigating a two-wheel mobile robot is an increasingly active area of research, due to its potential applications in last-mile delivery, pedestrian transportation, warehouse automation, parts supply, agriculture, surveillance, and monitoring. This thesis investigates the design and control of a two-wheel balancing mobile robot using three different control strategies: Proportional Integral Derivative (PID) control, Sliding Mode Control, and Deep Q-Learning. The mobile robot is modeled using a dynamic and kinematic model, and its motion is simulated in a custom MATLAB/Simulink environment. The first part of the thesis focuses on developing a dynamic and kinematic model for the mobile robot. The robot dynamics are derived using the classical Euler-Lagrange method, in which motion is described using the potential and kinetic energies of the bodies. Non-holonomic constraints are included in the model, via the method of Lagrange multipliers, to achieve desired motion such as non-drifting of the mobile robot. Navigation for the robot is developed using artificial potential field path planning to generate a map of velocity vectors that serve as set points for linear velocity and yaw rate. The second part of the thesis focuses on developing and evaluating the three control strategies: PID control, Hierarchical Sliding Mode Control, and Deep Q-Learning. The performances of the different control strategies are evaluated and compared based on various metrics, such as stability, robustness to mass variations and disturbances, and tracking accuracy. The implementation and evaluation of these strategies are modeled and tested in a MATLAB/Simulink virtual environment.
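Of the three strategies compared above, the PID loop is the simplest to sketch. The toy linearized pendulum plant and the gains below are illustrative assumptions, not the thesis's robot model or tuned controller.

```python
# Minimal discrete PID sketch regulating the tilt angle of a linearized
# inverted pendulum; plant parameters and gains are hypothetical.
def simulate_pid(kp=40.0, ki=2.0, kd=8.0, dt=0.01, steps=3000):
    theta, omega = 0.2, 0.0          # initial tilt (rad) and tilt rate (rad/s)
    g_over_l = 9.81 / 0.5            # unstable pole of the linearized pendulum
    integral = 0.0
    prev_err = 0.0 - theta           # avoid a derivative kick on the first step
    for _ in range(steps):
        err = 0.0 - theta            # regulate tilt to upright (0 rad)
        integral += err * dt
        deriv = (err - prev_err) / dt
        u = kp * err + ki * integral + kd * deriv
        prev_err = err
        # linearized dynamics: theta'' = (g/l) * theta + u
        omega += (g_over_l * theta + u) * dt
        theta += omega * dt
    return abs(theta)

final_tilt = simulate_pid()          # residual tilt after 30 simulated seconds
```

Note that kp must exceed g/l here for the proportional term alone to overcome the pendulum's unstable stiffness; the derivative term then supplies damping, and the integral term removes steady-state offset.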
Modeling and control of an inverted-pendulum robot using reinforcement learning
Undergraduate thesis (TCC) - Universidade Federal de Santa Catarina, Centro Tecnológico, Engenharia Eletrônica. This work presents experimental research on the virtual simulation of a robotic model of the inverted-pendulum type on two parallel wheels. Balancing this system is a classic task in the control field. Due to properties such as nonlinearity and instability, many classical control algorithms are difficult to apply to this system. In this context, an alternative approach is proposed, with controllers based on artificial intelligence. The robot was modeled in state space based on Newton's laws and electrical circuits, including parameters such as moments of inertia, motor inductance, friction forces, and the effects of mechanical reduction. To make the model more realistic, the motor parameters were determined from a real motor. The objective was to control both the inclination and the position of the robot, defined relative to the plane on which it moves. To this end, two reinforcement learning algorithms were applied: Double Deep Q-Network, which uses artificial neural networks to treat the state space continuously, and Twin Delayed Deep Deterministic Policy Gradient, which operates with continuous states and actions. The control task was thus solved through trial-and-error learning, without the need for prior knowledge of the system. The performance of the resulting controllers, as well as the efficiency of their training, was evaluated over multiple simulations. In addition, the effect of noise in sensor measurements used in practical applications and the application of an input signal to control the robot's position were examined. Finally, the effects of modifications to the function that characterizes the controllers' objective were evaluated. It was concluded that the applied algorithms have the potential to outperform classical controllers in several aspects; however, these methods also present implementation difficulties.
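The distinction between the vanilla DQN target and the Double DQN target used in the work above (decoupling action selection from action evaluation to reduce overestimation) fits in a few lines. The Q-values below are hypothetical numbers chosen to make the difference visible.

```python
import numpy as np

def dqn_target(r, q_next_target, gamma=0.99):
    # vanilla DQN: the target net both selects and evaluates the next action
    return r + gamma * q_next_target.max()

def double_dqn_target(r, q_next_online, q_next_target, gamma=0.99):
    # Double DQN: online net selects the action, target net evaluates it
    a_star = int(np.argmax(q_next_online))
    return r + gamma * q_next_target[a_star]

q_online = np.array([1.0, 2.0])   # hypothetical Q-values at the next state
q_target = np.array([1.5, 0.5])
y_dqn = dqn_target(0.0, q_target)
y_ddqn = double_dqn_target(0.0, q_online, q_target)
```

When the two networks disagree about the best action, the Double DQN target is bounded above by the DQN target, which is the source of its reduced overestimation bias.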
Coherent Transport of Quantum States by Deep Reinforcement Learning
Some problems in physics can be handled only after a suitable \textit{ansatz}
solution has been guessed. Such a method is therefore resistant to
generalization and of limited scope. The coherent transport by adiabatic
passage of a quantum state through an array of semiconductor quantum dots
provides a par excellence example of this approach, where it is necessary to
introduce the so-called counter-intuitive control-gate ansatz pulse sequence.
Instead, deep reinforcement learning has proven able to solve
very complex sequential decision-making problems involving competition between
short-term and long-term rewards, despite a lack of prior knowledge. We show
that in the above problem deep reinforcement learning discovers control
sequences outperforming the \textit{ansatz} counter-intuitive sequence. Even
more interestingly, it discovers novel strategies when realistic disturbances
affect the ideal system, achieving better speed and fidelity when energy detuning
between the ground states of the quantum dots or dephasing is added to the master
equation, and also mitigating the effects of losses. This method enables online
updating of realistic systems, as policy convergence is boosted by exploiting
prior knowledge when available. Deep reinforcement learning proves
effective at controlling the dynamics of quantum states, and more generally it applies
whenever an ansatz solution is unknown or insufficient to treat the
problem effectively.
Robust and versatile humanoid locomotion based on analytical control and residual physics
Humanoid robots are made to resemble humans but their locomotion
abilities are far from ours in terms of agility and versatility. When humans
walk on complex terrains or face external disturbances, they
combine a set of strategies, unconsciously and efficiently, to regain
stability. This thesis tackles the problem of developing a robust omnidirectional
walking framework, which is able to generate versatile
and agile locomotion on complex terrains. We designed and developed
model-based and model-free walk engines and formulated the
controllers using different approaches including classical and optimal
control schemes and validated their performance through simulations
and experiments. These frameworks have hierarchical structures composed
of several layers; each layer consists of several interconnected
modules, which reduces complexity and increases the flexibility of the
proposed frameworks. Additionally, they can be easily and quickly
deployed on different platforms.
Moreover, we believe that using machine learning on top of analytical approaches
is key to opening the door for humanoid robots to step out of laboratories.
We proposed a tight coupling between analytical control and
deep reinforcement learning. We augmented our analytical controller
with reinforcement learning modules to learn how to regulate the walk
engine parameters (planners and controllers) adaptively and generate
residuals to adjust the robot’s target joint positions (residual physics).
The effectiveness of the proposed frameworks was demonstrated and
evaluated across a set of challenging simulation scenarios. The robot
was able to generalize what it learned in one scenario, by displaying
human-like locomotion skills in unforeseen circumstances, even in the
presence of noise and external pushes.
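The residual-physics idea described above (final joint targets = analytic walk-engine output + small learned corrections) can be sketched as follows. The controller, observation, and residual policy here are stand-in assumptions for illustration, not the thesis's walk engine or trained networks.

```python
import numpy as np

def analytic_controller(phase):
    """Hypothetical baseline: nominal joint targets from an analytic walk engine."""
    return np.array([0.3 * np.sin(phase), 0.3 * np.sin(phase + np.pi)])

def policy_residual(obs):
    """Stand-in for a learned network: small corrections around the baseline."""
    return np.clip(0.05 * obs, -0.1, 0.1)    # residuals are kept small by design

def residual_physics_action(phase, obs):
    # final joint targets = analytic solution + learned residual
    return analytic_controller(phase) + policy_residual(obs)

obs = np.array([0.4, -0.4])                  # e.g., torso pitch/roll error signals
action = residual_physics_action(0.0, obs)
```

Clipping the residual keeps the learned component a perturbation of the analytic gait rather than a replacement for it, which is what makes this decomposition sample-efficient and safe to deploy.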
An upper limb Functional Electrical Stimulation controller based on Reinforcement Learning: A feasibility case study.
Controllers for Functional Electrical Stimulation (FES) are still not able to restore natural movements in the paretic arm. In this work, Reinforcement Learning (RL) is used for the first time to control a hybrid upper-limb robotic system for stroke rehabilitation in a real environment. The feasibility of the FES controller was tested on one healthy subject during elbow flexion-extension in the horizontal plane. Results showed an absolute position error of less than 1.2° over a maximum range of motion of 50°.