61 research outputs found

    An adaptive actor-critic algorithm with multi-step simulated experiences for controlling nonholonomic mobile robots

    In this paper, we propose a new adaptive actor-critic algorithm with multi-step simulated experiences, as a kind of temporal difference (TD) method. In our approach, the TD-error is composed of two value functions and m utility functions, where m denotes the number of steps over which the experience is simulated. The value function is constructed from the critic, formulated as a radial basis function neural network (RBFNN), which takes as input a simulated experience generated from a predictive model based on a kinematic model. Since our approach assumes that the model is available to simulate the m-step experiences and to design a controller, the same kinematic model is also used to construct the actor; the resulting model-based actor (MBA) is also regarded as a network, i.e., it is viewed as a resolved velocity control network. We apply this approach to the control of a nonholonomic mobile robot, specifically to a trajectory tracking control problem for the position coordinates and azimuth. Simulations show the effectiveness of the proposed method for controlling a mobile robot with two independent driving wheels.
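
As a rough illustration of a TD-error built from two value-function evaluations and m utility terms, the sketch below rolls a unicycle kinematic model forward m steps and forms the standard m-step TD error. The abstract does not give the formula, so the discounted-sum form, the quadratic utility, the discount factor `gamma`, the time step `dt`, and all function names here are assumptions rather than the paper's actual design; `value_fn` stands in for the RBFNN critic.

```python
import math

def unicycle_step(state, v, omega, dt=0.1):
    """One Euler step of a unicycle kinematic model (assumed stand-in
    for the paper's predictive model). state = (x, y, theta)."""
    x, y, theta = state
    return (x + v * math.cos(theta) * dt,
            y + v * math.sin(theta) * dt,
            theta + omega * dt)

def utility(state, reference):
    """Hypothetical quadratic tracking cost on (x, y, theta).
    Angle wrapping is ignored to keep the sketch short."""
    return sum((s - r) ** 2 for s, r in zip(state, reference))

def m_step_td_error(state, controls, reference, value_fn, gamma=0.95, dt=0.1):
    """Assumed m-step TD error:
    sum_{k=0}^{m-1} gamma^k U_{t+k} + gamma^m V(x_{t+m}) - V(x_t),
    where m = len(controls) and the future states are simulated."""
    v_now = value_fn(state)          # first value-function evaluation
    total = 0.0
    sim = state
    for k, (v, omega) in enumerate(controls):
        total += (gamma ** k) * utility(sim, reference)  # m utility terms
        sim = unicycle_step(sim, v, omega, dt)           # simulated experience
    v_future = value_fn(sim)         # second value-function evaluation
    return total + (gamma ** len(controls)) * v_future - v_now
```

With m = 1 this reduces to the familiar one-step TD error, which is one way to sanity-check the sketch.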

    Adaptive dynamic programming with eligibility traces and complexity reduction of high-dimensional systems

    This dissertation investigates the application of a variety of computational intelligence techniques, particularly clustering and adaptive dynamic programming (ADP) designs, especially heuristic dynamic programming (HDP) and dual heuristic programming (DHP). Moreover, one-step temporal-difference (TD(0)) and n-step TD (TD(λ)) learning, together with their gradients, are used as learning algorithms to train and adapt online the families of ADP. The dissertation is organized into seven papers. The first paper demonstrates the robustness of model order reduction (MOR) for simulating complex dynamical systems. Agglomerative hierarchical clustering based on performance evaluation is introduced for MOR. This method computes the reduced-order denominator of the transfer function by clustering system poles in a hierarchical dendrogram. Several numerical examples of reduction techniques are taken from the literature for comparison with our work. In the second paper, HDP is combined with the Dyna algorithm for path planning. The third paper uses DHP with an eligibility trace parameter (λ) to track a reference trajectory under uncertainties for a nonholonomic mobile robot, using a first-order Sugeno fuzzy neural network structure for the critic and actor networks. In the fourth and fifth papers, a stability analysis for a model-free action-dependent HDP(λ) is demonstrated with batch and online learning implementations, respectively. The sixth work combines two different gradient prediction levels of critic networks and provides convergence proofs. The seventh paper develops two-hybrid recurrent fuzzy neural network structures for both critic and actor networks. They use a novel n-step gradient temporal-difference (gradient of TD(λ)) of an advanced ADP algorithm called value-gradient learning (VGL(λ)), and convergence proofs are given. Furthermore, the seventh paper is the first to combine the single network adaptive critic with VGL(λ). --Abstract, page iv
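
To make the role of the eligibility trace parameter λ concrete, here is a minimal tabular TD(λ) sketch with accumulating traces. It is a textbook form of the algorithm, not the dissertation's HDP(λ), DHP(λ), or VGL(λ) designs, which use neural critics; the function name, the dictionary-based value table, and the default step size, discount, and λ are all assumptions for illustration.

```python
def td_lambda_episode(values, transitions, alpha=0.1, gamma=0.9, lam=0.8):
    """Run tabular TD(lambda) with accumulating traces over one episode.

    values:      dict mapping state -> current value estimate (updated in place)
    transitions: list of (state, reward, next_state); next_state is None
                 at the end of the episode
    """
    traces = {s: 0.0 for s in values}
    for state, reward, next_state in transitions:
        next_v = 0.0 if next_state is None else values[next_state]
        delta = reward + gamma * next_v - values[state]  # one-step TD error
        traces[state] += 1.0                             # mark the visited state
        for s in values:
            values[s] += alpha * delta * traces[s]       # credit recent states
            traces[s] *= gamma * lam                     # decay every trace
    return values
```

Setting `lam=0` zeroes each trace right after its update, so only the current state is adjusted and the method reduces to TD(0); larger values of `lam` spread each TD error further back along the visited states.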

    Adaptive and learning-based formation control of swarm robots

    Autonomous aerial and wheeled mobile robots play a major role in tasks such as search and rescue, transportation, monitoring, and inspection. However, these operations face several open challenges, including robust autonomy and adaptive coordination based on the environment and operating conditions, particularly in swarm robots with limited communication and perception capabilities. Furthermore, the computational complexity increases exponentially with the number of robots in the swarm. This thesis examines two different aspects of the formation control problem. On the one hand, we investigate how formation could be performed by swarm robots with limited communication and perception (e.g., Crazyflie nano quadrotor). On the other hand, we explore human-swarm interaction (HSI) and different shared-control mechanisms between human and swarm robots (e.g., BristleBot) for artistic creation. In particular, we combine bio-inspired techniques (i.e., flocking, foraging) with learning-based control strategies (using artificial neural networks) for adaptive control of multi-robot systems. We first review how learning-based control and networked dynamical systems can be used to assign distributed and decentralized policies to individual robots such that the desired formation emerges from their collective behavior. We proceed by presenting a novel flocking control for UAV swarms using deep reinforcement learning. We formulate the flocking formation problem as a partially observable Markov decision process (POMDP) and consider a leader-follower configuration, where consensus among all UAVs is used to train a shared control policy, and each UAV performs actions based on the local information it collects. In addition, to avoid collisions among UAVs and guarantee flocking and navigation, the reward function combines global flocking maintenance, a mutual reward, and a collision penalty. We adapt deep deterministic policy gradient (DDPG) with centralized training and decentralized execution to obtain the flocking control policy using actor-critic networks and a global state space matrix. In the context of swarm robotics in the arts, we investigate how the formation paradigm can serve as an interaction modality for artists to aesthetically utilize swarms. In particular, we explore particle swarm optimization (PSO) and random walk to control the communication between a team of robots with swarming behavior for musical creation.

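    Design and Construction of a Microcontroller-Based Pneumatic Cardiopulmonary Resuscitation (CPR) Device (original title: RANCANG BANGUN ALAT RESUSITASI JANTUNG PARU (RJP) PNEUMATIK BERBASIS MIKROKONTROLER)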

    This study aims to design a cardiopulmonary resuscitation (CPR) device that uses a microcontroller-based pneumatic system to restore breathing as a life-saving measure after sudden cardiac arrest. The work covers system design, manufacture, and assembly, using an Arduino Mega 2560 as the microcontroller and a pneumatic system as the actuator. The device is equipped with an MLX90614 temperature sensor to detect the patient's temperature and a heart rate sensor to measure the patient's heart rate, both displayed on an LCD. The method used is the Borg and Gall Research and Development (R&D) method, of which only 7 stages were applied: data search and collection, design, product development, testing, product revision, product testing, and operational product revision. The results show that the system can perform cardiopulmonary resuscitation while displaying the patient's temperature and heart rate on the LCD. The device performs chest compressions at 100 per minute with a compression depth of 5 cm. The MLX90614 temperature sensor detects patient temperature with a percentage error of 0.008%, and the heart rate sensor measures the patient's heart rate with a percentage error of 0.014%.

    Physics-based Machine Learning Methods for Control and Sensing in Fish-like Robots

    Underwater robots are important for the construction and maintenance of underwater infrastructure, underwater resource extraction, and defense. However, they currently fall far behind biological swimmers such as fish in agility, efficiency, and sensing capabilities. As a result, mimicking the capabilities of biological swimmers has become an area of significant research interest. In this work, we focus specifically on improving the control and sensing capabilities of fish-like robots. Our control work focuses on using the Chaplygin sleigh, a two-dimensional nonholonomic system which has been used to model fish-like swimming, as part of a curriculum to train a reinforcement learning agent to control a fish-like robot to track a prescribed path. The agent is first trained on the Chaplygin sleigh model, which is not an accurate model of the swimming robot but crucially has similar physics; having learned these physics, the agent is then trained on a simulated swimming robot, resulting in faster convergence compared to only training on the simulated swimming robot. Our sensing work separately considers using kinematic data (proprioceptive sensing) and using surface pressure sensors. The effect of a swimming body's internal dynamics on proprioceptive sensing is investigated by collecting time series of kinematic data of both a flexible and a rigid body in a water tunnel behind a moving obstacle performing different motions, and using machine learning to classify the motion of the upstream obstacle. This revealed that the flexible body could more effectively classify the motion of the obstacle, even if only one of its internal states is used. We also consider the problem of using time series data from a 'lateral line' of pressure sensors on a fish-like body to estimate the position of an upstream obstacle. Feature extraction from the pressure data is attempted with a state-of-the-art convolutional neural network (CNN), and this is compared with using the dominant modes of a Koopman operator constructed on the data as features. It is found that both sets of features achieve similar estimation performance using a dense neural network to perform the estimation. This highlights the potential of the Koopman modes as an interpretable alternative to CNNs for high-dimensional time series. This problem is also extended to inferring the time evolution of the flow field surrounding the body using the same surface measurements, which is performed by first estimating the dominant Koopman modes of the surrounding flow, and using those modes to perform a flow reconstruction. This strategy of mapping from surface to field modes is more interpretable than directly constructing a mapping of unsteady fluid states, and is found to be effective at reconstructing the flow. The sensing frameworks developed as a result of this work allow better awareness of obstacles and flow patterns, knowledge which can inform the generation of paths through the fluid that the developed controller can track, contributing to the autonomy of swimming robots in challenging environments.

    Navigational Path Analysis of Mobile Robot in Various Environments

    This dissertation describes work in the area of autonomous mobile robots. The objective is navigation of a mobile robot in a real-world dynamic environment while avoiding structured and unstructured obstacles, whether static or dynamic. The shapes and positions of obstacles are not known to the robot prior to navigation. The mobile robot has sensory recognition of specific objects in the environment. This sensory information provides local information about the robot's immediate surroundings to its controllers. The robot processes this information intelligently to reach the global objective (the target). The navigational path, as well as the time taken during navigation by the mobile robot, can be expressed as an optimisation problem and thus can be analysed and solved using AI techniques. The optimisation of the path and the time taken is based on the kinematic stability and the intelligence of the robot controller. A successful way of structuring the navigation task deals with the issues of individual behaviour design and action coordination of the behaviours. The navigation objective is addressed using fuzzy logic, neural networks, adaptive neuro-fuzzy inference systems, and other AI techniques. The research also addresses distributed autonomous systems using multiple robots.

    Advances in Reinforcement Learning

    Reinforcement Learning (RL) is a very dynamic area in terms of theory and application. This book brings together many different aspects of current research in several fields associated with RL, which has been growing rapidly, producing a wide variety of learning algorithms for different applications. Comprising 24 chapters, it covers a very broad variety of topics in RL and their application in autonomous systems. Some chapters provide a general overview of RL, while others focus mostly on applications of RL paradigms: Game Theory, Multi-Agent Theory, Robotics, Networking Technologies, Vehicular Navigation, Medicine, and Industrial Logistics.

    Mobile Robots Navigation

    Mobile robot navigation includes different interrelated activities: (i) perception, as obtaining and interpreting sensory information; (ii) exploration, as the strategy that guides the robot to select the next direction to go; (iii) mapping, involving the construction of a spatial representation by using the sensory information perceived; (iv) localization, as the strategy to estimate the robot position within the spatial map; (v) path planning, as the strategy to find a path towards a goal location, whether optimal or not; and (vi) path execution, where motor actions are determined and adapted to environmental changes. The book addresses those activities by integrating results from the research work of several authors all over the world. Research cases are documented in 32 chapters organized within 7 categories described next