Adaptive dynamic programming with eligibility traces and complexity reduction of high-dimensional systems
This dissertation investigates the application of a variety of computational intelligence techniques, particularly clustering and adaptive dynamic programming (ADP) designs, especially heuristic dynamic programming (HDP) and dual heuristic programming (DHP). Moreover, one-step temporal-difference learning (TD(0)) and n-step TD learning (TD(λ)), together with their gradients, are utilized as learning algorithms to train and adapt online the families of ADP. The dissertation is organized into seven papers. The first paper demonstrates the robustness of model order reduction (MOR) for simulating complex dynamical systems. Agglomerative hierarchical clustering based on performance evaluation is introduced for MOR. This method computes the reduced-order denominator of the transfer function by clustering system poles in a hierarchical dendrogram. Several numerical examples of reduction techniques from the literature are compared with our work. In the second paper, HDP is combined with the Dyna algorithm for path planning. The third paper uses DHP with an eligibility trace parameter (λ) to track a reference trajectory under uncertainties for a nonholonomic mobile robot, using a first-order Sugeno fuzzy neural network structure for the critic and actor networks. In the fourth and fifth papers, a stability analysis for a model-free action-dependent HDP(λ) is demonstrated with batch and online learning implementations, respectively. The sixth work combines two different gradient prediction levels of critic networks and provides convergence proofs. The seventh paper develops two hybrid recurrent fuzzy neural network structures for the critic and actor networks. They use a novel n-step gradient temporal-difference (gradient of TD(λ)) of an advanced ADP algorithm called value-gradient learning (VGL(λ)), and convergence proofs are given. Furthermore, the seventh paper is the first to combine the single network adaptive critic with VGL(λ). --Abstract, page iv
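To make the role of the eligibility trace parameter λ concrete, the following is a minimal tabular sketch of TD(λ) with accumulating traces. It is not the dissertation's method, which trains neural-network critics and actors; the toy chain environment, the function name `td_lambda_chain`, and all parameter values are assumptions chosen for illustration.

```python
def td_lambda_chain(n_states=4, episodes=200, alpha=0.5, gamma=1.0, lam=0.8):
    """Tabular TD(lambda) with accumulating eligibility traces on a toy chain.

    Hypothetical environment (an assumption, not from the dissertation):
    the agent always steps right from state 0 and receives reward 1 when it
    leaves the last state, so every true state value is 1 when gamma == 1.
    """
    values = [0.0] * n_states
    for _ in range(episodes):
        traces = [0.0] * n_states  # eligibility traces are reset each episode
        for s in range(n_states):
            terminal = s == n_states - 1
            reward = 1.0 if terminal else 0.0
            next_value = 0.0 if terminal else values[s + 1]
            # One-step TD error for the transition out of state s.
            delta = reward + gamma * next_value - values[s]
            # Accumulating trace: mark the state that was just visited.
            traces[s] += 1.0
            # Every state is updated in proportion to its current trace,
            # then all traces decay by gamma * lambda.
            for i in range(n_states):
                values[i] += alpha * delta * traces[i]
                traces[i] *= gamma * lam
    return values


if __name__ == "__main__":
    print(td_lambda_chain())          # estimates approach the true value 1
    print(td_lambda_chain(lam=0.0))   # lam = 0 reduces to one-step TD(0)
```

Setting `lam=0.0` makes every trace vanish after one step, so only the state just visited is updated, which is the TD(0) case named in the abstract. Larger λ spreads each TD error further back along the visited states.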
A brief review of neural networks based learning and control and their applications for robots
As an imitation of biological nervous systems, neural networks (NNs), which are characterized by powerful learning ability, have been employed in a wide range of applications, such as control of complex nonlinear systems, optimization, system identification, and pattern recognition. This article provides a brief review of state-of-the-art NNs for complex nonlinear systems. Recent progress in NNs in both theoretical developments and practical applications is investigated and surveyed. Specifically, NN-based robot learning and control applications are reviewed, including NN-based robot manipulator control, NN-based human-robot interaction, and NN-based behavior recognition and generation.
Design of a cognitive neural predictive controller for mobile robot
This thesis was submitted for the degree of Doctor of Philosophy and was awarded by Brunel University. In this thesis, a cognitive neural predictive controller system has been designed to guide a nonholonomic wheeled mobile robot during continuous and non-continuous trajectory tracking and to navigate collision-free through static obstacles with minimum tracking error. The structure of the controller consists of two layers. The first layer is a neural network system that controls the mobile robot actuators in order to track a desired path. The second layer is a cognitive layer that collects information from the environment and plans the optimal path. In addition, it detects any obstacle in the path so that the obstacle can be avoided by re-planning the trajectory using the particle swarm optimisation (PSO) technique.
Two neural network models are used. The first is a modified Elman recurrent neural network that describes the kinematic and dynamic model of the mobile robot; it is trained in off-line and on-line stages to guarantee that the outputs of the model accurately represent the actual outputs of the mobile robot system. The trained neural model acts as the position and orientation identifier. The second model is a feedforward multi-layer perceptron neural network that describes a feedforward neural controller; it is trained off-line and its weights are adapted on-line to find the reference torques, which control the steady-state outputs of the mobile robot system. The feedback neural controller is based on the posture neural identifier and a quadratic performance index predictive optimisation algorithm for N-step-ahead prediction, in order to find the optimal torque action that stabilises the tracking error of the mobile robot system when the robot's trajectory drifts from the desired path during the transient state.
Three controller methodologies were developed: the first is the feedback neural controller; the second is the nonlinear PID neural feedback controller; and the third is the nonlinear inverse dynamic neural feedback controller, based on the back-stepping method and Lyapunov criterion. The main advantages of the presented approaches are that the robot plans an optimal path for itself, avoiding obstructions by using the intelligent PSO technique, and that the analytically derived control law, which has high computational accuracy, is combined with a predictive optimisation technique to obtain the optimal torque control action, leading to minimum tracking error of the mobile robot for different types of trajectories.
The proposed control algorithm has been applied to a nonholonomic wheeled mobile robot and has demonstrated the capability of tracking different trajectories with continuous gradients (lemniscate and circular) or non-continuous gradients (square) under bounded external disturbances and static obstacles. Simulation results and experimental work showed the effectiveness of the proposed cognitive neural predictive control algorithm; this is demonstrated by a tracking error minimised to less than 1 cm and a smooth torque control signal below the maximum torque (0.236 N.m), especially when external disturbances are applied and when navigating through static obstacles.
Results show that the five-step-ahead prediction algorithm performs better than the one-step-ahead algorithm for all the control methodologies, because of its more complex control structure and because it takes into account future values of the desired trajectory, not only the current value as in the one-step-ahead method. The mean-square error of each component of the state error vector is used to compare the control methodologies and identify which gives the better control results.
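The per-component mean-square error comparison mentioned above can be sketched as follows. This is only an illustration under assumptions: the state is taken to be a pose tuple such as (x, y, θ), the function name `mse_per_component` is hypothetical, and no angle wrapping is applied to the orientation error, which the thesis may handle differently.

```python
def mse_per_component(desired, actual):
    """Return the mean-square error of each state component over a trajectory.

    `desired` and `actual` are equal-length sequences of state tuples,
    for example (x, y, theta) poses. This layout is an assumption made
    for illustration only.
    """
    if not desired or len(desired) != len(actual):
        raise ValueError("trajectories must be non-empty and of equal length")
    n_components = len(desired[0])
    sums = [0.0] * n_components
    for d, a in zip(desired, actual):
        for i in range(n_components):
            err = d[i] - a[i]      # component-wise tracking error
            sums[i] += err * err   # accumulate the squared error
    return [s / len(desired) for s in sums]


if __name__ == "__main__":
    reference = [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0)]
    measured = [(0.1, 0.0, 0.0), (1.0, 1.2, 0.0)]
    print(mse_per_component(reference, measured))
```

Computing one value per component, instead of a single scalar, keeps position and orientation errors separate, so two controllers can be compared on each quantity without mixing units.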
A study on robot navigation without prior knowledge in dynamic environments using deep reinforcement learning
Tohoku University, Doctor of Engineering, thesis
Adaptive and learning-based formation control of swarm robots
Autonomous aerial and wheeled mobile robots play a major role in tasks such as search and rescue, transportation, monitoring, and inspection. However, these operations face a few open challenges, including robust autonomy and adaptive coordination based on the environment and operating conditions, particularly in swarm robots with limited communication and perception capabilities. Furthermore, the computational complexity increases exponentially with the number of robots in the swarm. This thesis examines two different aspects of the formation control problem. On the one hand, we investigate how formation could be performed by swarm robots with limited communication and perception (e.g., the Crazyflie nano quadrotor). On the other hand, we explore human-swarm interaction (HSI) and different shared-control mechanisms between humans and swarm robots (e.g., BristleBot) for artistic creation. In particular, we combine bio-inspired techniques (i.e., flocking, foraging) with learning-based control strategies (using artificial neural networks) for adaptive control of multi-robot systems. We first review how learning-based control and networked dynamical systems can be used to assign distributed and decentralized policies to individual robots such that the desired formation emerges from their collective behavior. We proceed by presenting a novel flocking control for UAV swarms using deep reinforcement learning. We formulate the flocking formation problem as a partially observable Markov decision process (POMDP) and consider a leader-follower configuration, where consensus among all UAVs is used to train a shared control policy, and each UAV performs actions based on the local information it collects. In addition, to avoid collisions among UAVs and guarantee flocking and navigation, the reward function combines a global flocking maintenance term, a mutual reward, and a collision penalty.
We adapt deep deterministic policy gradient (DDPG) with centralized training and decentralized execution to obtain the flocking control policy using actor-critic networks and a global state space matrix. In the context of swarm robotics in arts, we investigate how the formation paradigm can serve as an interaction modality for artists to aesthetically utilize swarms. In particular, we explore particle swarm optimization (PSO) and random walk to control the communication between a team of robots with swarming behavior for musical creation.
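One ingredient of standard DDPG is the soft (Polyak) update that lets target actor and critic parameters slowly track the trained ones. The sketch below shows only that step and assumes parameters stored as a dictionary of float lists. The function name `soft_update`, the data layout, and the value of `tau` are illustrative assumptions; the abstract does not describe how the thesis adapts this step.

```python
def soft_update(target_params, online_params, tau=0.005):
    """Return target parameters moved a fraction tau toward the online ones.

    Implements target <- tau * online + (1 - tau) * target for every weight.
    Both arguments map a (hypothetical) layer name to a list of floats.
    """
    updated = {}
    for name, target_weights in target_params.items():
        online_weights = online_params[name]
        updated[name] = [
            tau * w_online + (1.0 - tau) * w_target
            for w_online, w_target in zip(online_weights, target_weights)
        ]
    return updated


if __name__ == "__main__":
    target = {"critic.w": [0.0, 0.0]}
    online = {"critic.w": [1.0, 2.0]}
    print(soft_update(target, online, tau=0.1))
```

A small `tau` keeps the bootstrapped critic targets changing slowly, which is the usual motivation for target networks in actor-critic training.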
Mobile Robot Feature-Based SLAM Behavior Learning, and Navigation in Complex Spaces
Learning the mobile robot's space and navigation behavior is an essential requirement for improved navigation, in addition to gaining a better understanding of the navigation maps. This chapter presents mobile robot feature-based SLAM behavior learning and navigation in complex spaces. Mobile intelligence has been based on blending a number of functionalities related to navigation, including learning the main features of the SLAM map. To achieve this, the mobile system was built on diverse levels of intelligence, including principal component analysis (PCA), a neuro-fuzzy (NF) learning system as a classifier, and a fuzzy rule-based decision system (FRD).
Tracking control of redundant mobile manipulator: An RNN based metaheuristic approach
In this paper, we propose a topology of Recurrent Neural Network (RNN) based on a metaheuristic optimization algorithm for the tracking control of a mobile manipulator while enforcing nonholonomic constraints. Traditional approaches for tracking control of mobile robots usually require the computation of the Jacobian inverse or linearization of the mathematical model. The proposed algorithm uses a nature-inspired optimization approach to directly solve the nonlinear optimization problem without any further transformation. First, we formulate the tracking control as a constrained optimization problem. The optimization problem is formulated at the position level to avoid the computationally expensive Jacobian inversion. The nonholonomic limitation is ensured by adding equality constraints to the formulated optimization problem. We then present the Beetle Antennae Olfactory Recurrent Neural Network (BAORNN) algorithm to solve the optimization problem efficiently using very few mathematical operations. We present a theoretical analysis of the proposed algorithm and show that its computational cost is linear with respect to the number of degrees of freedom (DOFs), i.e., O(m). Additionally, we prove its stability and convergence. Extensive simulation results are prepared using a simulated model of IIWA14, a 7-DOF industrial manipulator, mounted on a differentially driven cart. Comparison results with the particle swarm optimization (PSO) algorithm are also presented to demonstrate the accuracy and numerical efficiency of the proposed controller. The results demonstrate that the proposed algorithm is several times (around 75 in the worst case) faster in execution than PSO and is suitable for real-time implementation. The tracking results for three different trajectories (circular, rectangular, and rhodonea paths) are presented.
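For orientation, the sketch below shows the plain beetle antennae search heuristic that gives this family of methods its name: probe the objective at two "antennae" on either side of the current point along a random direction, then step toward the better side. It is not the BAORNN controller from the paper, and it omits the nonholonomic equality constraints. The function name, the step schedule, and the toy quadratic objective are assumptions made for illustration.

```python
import math
import random


def beetle_antennae_search(f, x0, steps=200, step0=1.0, decay=0.95,
                           antenna_ratio=0.5, seed=0):
    """Minimise f with a basic beetle antennae search (illustrative sketch).

    All parameter names and defaults are assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    x = list(x0)
    best_x, best_f = list(x), f(x)
    step = step0
    for _ in range(steps):
        # Draw a random unit direction for the antennae.
        b = [rng.gauss(0.0, 1.0) for _ in x]
        norm = math.sqrt(sum(c * c for c in b)) or 1.0
        b = [c / norm for c in b]
        # Probe the objective at the two antenna tips.
        d = antenna_ratio * step
        left = [xi + d * bi for xi, bi in zip(x, b)]
        right = [xi - d * bi for xi, bi in zip(x, b)]
        diff = f(left) - f(right)
        sign = (diff > 0) - (diff < 0)
        # Move toward the antenna that reported the lower objective value.
        x = [xi - step * sign * bi for xi, bi in zip(x, b)]
        fx = f(x)
        if fx < best_f:
            best_x, best_f = list(x), fx
        step *= decay  # shrink the step so the search settles
    return best_x, best_f


if __name__ == "__main__":
    target = [1.0, -2.0]
    cost = lambda p: sum((pi - ti) ** 2 for pi, ti in zip(p, target))
    print(beetle_antennae_search(cost, [4.0, 2.0]))
```

Each iteration needs two objective evaluations and a handful of vector operations whose count grows with the dimension of `x`, and it requires neither gradients nor a Jacobian inverse.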
Learning Image-Conditioned Dynamics Models for Control of Under-actuated Legged Millirobots
Millirobots are a promising robotic platform for many applications due to
their small size and low manufacturing costs. Legged millirobots, in
particular, can provide increased mobility in complex environments and improved
scaling of obstacles. However, controlling these small, highly dynamic, and
underactuated legged systems is difficult. Hand-engineered controllers can
sometimes control these legged millirobots, but they have difficulties with
dynamic maneuvers and complex terrains. We present an approach for controlling
a real-world legged millirobot that is based on learned neural network models.
Using less than 17 minutes of data, our method can learn a predictive model of
the robot's dynamics that can enable effective gaits to be synthesized on the
fly for following user-specified waypoints on a given terrain. Furthermore, by
leveraging expressive, high-capacity neural network models, our approach allows
for these predictions to be directly conditioned on camera images, endowing the
robot with the ability to predict how different terrains might affect its
dynamics. This enables sample-efficient and effective learning for locomotion
of a dynamic legged millirobot on various terrains, including gravel, turf,
carpet, and styrofoam. Experiment videos can be found at
https://sites.google.com/view/imageconddy