
    Advanced Strategies for Robot Manipulators

    Amongst robotic systems, robot manipulators have proven increasingly important and are widely adopted to substitute for humans in repetitive and/or hazardous tasks. Modern manipulators have complicated designs and must perform ever more precise and critical tasks, so simple traditional control methods are no longer sufficient, and advanced control strategies that account for special constraints need to be established. Although groundbreaking research has been carried out in this realm, many novel aspects remain to be explored.

    Classical and intelligent methods in model extraction and stabilization of a dual-axis reaction wheel pendulum: A comparative study

    Controlling underactuated, open-loop unstable systems is challenging. In this study, nonlinear and linear models of a dual-axis reaction wheel pendulum (DA-RWP) are first extracted by employing Lagrangian equations, which are based on energy methods. Then, to control the system and stabilize the pendulum's angle in the upright position, fuzzy-logic-based controllers are developed for both the x and y directions. To show the efficiency of the designed intelligent controller, comparisons are made with its classical optimal control counterparts. In our simulations, as proof of the reliability and robustness of the fuzzy controller, two scenarios are considered: one free of noise and disturbance, and one with both. The comparisons between the classical and fuzzy-based controllers reveal the superiority of the proposed fuzzy logic controller in terms of time response. The simulation results of our experiments, in terms of both mathematical modeling and control, can serve as a baseline for robotics and aerospace studies such as developing walking humanoid robots and satellite attitude control systems, respectively. The work of U.F.-G. was supported by the government of the Basque Country through the ELKARTEK21/10 KK-2021/00014 and ELKARTEK22/85 research programs.
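
    As a loose illustration of the kind of fuzzy-logic stabilizer the paper describes, the sketch below implements a minimal single-axis Mamdani-style controller: triangular membership functions over tilt angle and angular velocity, a nine-rule table, min inference, and weighted-average defuzzification. The membership spans, rule singletons, and sign convention (positive tilt yields a restoring negative torque) are illustrative assumptions, not the paper's actual design.

```python
import numpy as np

def tri(x, a, b, c):
    """Triangular membership function with peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def memberships(x, span):
    """Degrees for the Negative / Zero / Positive terms over [-span, span]."""
    n = tri(x, -2 * span, -span, 0.0)
    z = tri(x, -span, 0.0, span)
    p = tri(x, 0.0, span, 2 * span)
    return np.array([n, z, p])

# Rule table: output torque singleton for (angle term, velocity term),
# indices 0=N, 1=Z, 2=P; the values (in N*m) are illustrative.
RULES = np.array([[ 2.0,  2.0,  1.0],
                  [ 1.0,  0.0, -1.0],
                  [-1.0, -2.0, -2.0]])

def fuzzy_torque(angle, ang_vel, angle_span=0.3, vel_span=1.0):
    """Min inference over all 9 rules, weighted-average defuzzification."""
    mu_a = memberships(angle, angle_span)
    mu_v = memberships(ang_vel, vel_span)
    w = np.minimum.outer(mu_a, mu_v)   # firing strength of each rule
    total = w.sum()
    return float((w * RULES).sum() / total) if total > 0 else 0.0
```

In a full DA-RWP design this controller would be instantiated once per axis and compared against an LQR tuned on the linearized model.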

    Bio-inspired robotic control in underactuation: principles for energy efficacy, dynamic compliance interactions and adaptability.

    Biological systems achieve energy-efficient and adaptive behaviours through extensive autologous and exogenous compliant interactions. Active dynamic compliances are created and enhanced from the musculoskeletal system (joint space) to the external environment (task space) amongst the underactuated motions. Underactuated systems with viscoelastic properties are similar to these biological systems, in that their self-organisation and overall tasks must be achieved by coordinating the subsystems and dynamically interacting with the environment. One important question to raise is: how can we design control systems that achieve efficient locomotion while adapting to dynamic conditions as living systems do? In this thesis, a trajectory planning algorithm is developed for underactuated microrobotic systems with bio-inspired self-propulsion and viscoelastic properties to achieve synchronized motion in an energy-efficient, adaptive and analysable manner. The geometry of the state space of the systems is explicitly utilized, such that synchronization of the generalized coordinates is achieved in terms of geometric relations along the desired motion trajectory. As a result, the internal-dynamics complexity is substantially reduced, the dynamic couplings are explicitly characterised, and the underactuated dynamics are projected onto a hyper-manifold. Following such a reduction and characterization, we arrive at mappings of system compliance and integrable second-order dynamics with the passive degrees of freedom. As such, the issue of trajectory planning is converted into convenient nonlinear geometric analysis and optimal trajectory parameterization. Solutions of the reduced dynamics and the geometric relations can be obtained through an optimal motion trajectory generator. The theoretical background of the proposed approach is presented with rigorous analysis and developed in detail for a particular example.
Experimental studies are conducted to verify the effectiveness of the proposed method. Towards compliant interactions with the environment, accurate modelling or prediction of nonlinear friction forces is a nontrivial and challenging task. Frictional instabilities typically need to be eliminated or compensated through efficiently designed controllers. In this work, a prediction and analysis framework is designed for the self-propelled vibro-driven system, whose locomotion relies greatly on dynamic interactions with nonlinear frictions. This thesis proposes a combined physics-based and analytical approach in which the non-reversible characteristics of the static friction, presliding and pure sliding regimes are revealed, and the frictional limit boundaries are identified. Nonlinear dynamic analysis and simulation results demonstrate accurate capture of experimentally observed frictional characteristics, quenching of friction-induced vibrations, and satisfaction of energy requirements. The thesis also presents detailed studies on trajectory tracking. Control schemes are designed and extended for a class of underactuated systems with concrete consideration of uncertainties and disturbances. They include a collocated partial feedback control scheme and an adaptive variable-structure control scheme with an elaborately designed auxiliary control variable. More generally, adaptive control schemes using neural networks are designed to ensure trajectory tracking. The theoretical background of these methods is presented with rigorous analysis and developed in detail for particular examples. The schemes promote the utilization of linear filters in the control input to improve system robustness. Asymptotic stability and convergence of time-varying reference trajectories for the system dynamics are shown by means of Lyapunov synthesis.
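
    For the friction-modelling part of such a framework, a common steady-state description combines Coulomb, static (Stribeck) and viscous terms; the sketch below shows this standard model, with parameter values that are illustrative placeholders rather than the identified values from the thesis.

```python
import math

def stribeck_friction(v, F_c=1.0, F_s=1.5, v_s=0.1, sigma_v=0.4):
    """Steady-state friction force opposing sliding velocity v.

    F_c: Coulomb (kinetic) level, F_s: static (breakaway) level,
    v_s: Stribeck velocity scale, sigma_v: viscous coefficient.
    All parameter values here are illustrative placeholders.
    """
    if v == 0.0:
        return 0.0  # sticking: the actual force is set by the external
                    # force balance, bounded in magnitude by F_s
    stribeck = F_c + (F_s - F_c) * math.exp(-(v / v_s) ** 2)
    return math.copysign(stribeck, v) + sigma_v * v
```

The negative slope of this curve at low speeds (the Stribeck dip) is precisely what can induce stick-slip oscillations, which is why friction-aware controllers matter for vibro-driven locomotion.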

    Reinforcement Learning Curricula as Interpolations between Task Distributions

    In the last decade, the increased availability of powerful computing machinery has led to an increasingly widespread application of machine learning methods. Machine learning has been particularly successful when large models, typically neural networks with an ever-increasing number of parameters, can leverage vast data to make predictions. While reinforcement learning (RL) has been no exception from this development, a distinguishing feature of RL is its well-known exploration-exploitation trade-off, whose optimal solution – while possible to model as a partially observable Markov decision process – evades computation in all but the simplest problems. Consequently, it seems unsurprising that notable demonstrations of reinforcement learning, such as an RL-based Go agent (AlphaGo) by DeepMind beating the professional Go player Lee Sedol, relied both on the availability of massive computing capabilities and specific forms of regularization that facilitate learning. In the case of AlphaGo, this regularization came in the form of self-play, enabling learning by interacting with gradually more proficient opponents. In this thesis, we develop techniques that, similarly to the concept of self-play of AlphaGo, improve the learning performance of RL agents by training on sequences of increasingly complex tasks. These task sequences are typically called curricula and are known to side-step problems such as slow learning or convergence to poor behavior that may occur when directly learning in complicated tasks. The algorithms we develop in this thesis create curricula by minimizing distances or divergences between probability distributions of learning tasks, generating interpolations between an initial distribution of easy learning tasks and a target task distribution. 
Apart from improving the learning performance of RL agents in experiments, developing methods that realize curricula as interpolations between task distributions results in a nuanced picture of key aspects of successful reinforcement learning curricula. In Chapter 1, we start this thesis by introducing required reinforcement learning notation and then motivating curriculum reinforcement learning from the perspective of continuation methods for non-linear optimization. Similar to curricula for reinforcement learning agents, continuation methods have been used in non-linear optimization to solve challenging optimization problems. This similarity provides an intuition about the effect of the curricula we aim to generate and their limits. In Chapter 2, we transfer the concept of self-paced learning, initially proposed in the supervised learning community, to the problem of RL, showing that an automated curriculum generation for RL agents can be motivated by a regularized RL objective. This regularized RL objective implies generating a curriculum as a sequence of task distributions that trade off the expected agent performance against similarity to a specified distribution of target tasks. This view on curriculum RL contrasts existing approaches, as it motivates curricula via a regularized RL objective instead of generating them from a set of assumptions about an optimal curriculum. In experiments, we show that an approximate implementation of the aforementioned curriculum – that restricts the interpolating task distribution to a Gaussian – results in improved learning performance compared to regular reinforcement learning, matching or surpassing the performance of existing curriculum-based methods. Subsequently, Chapter 3 builds on the intuition of curricula as sequences of interpolating task distributions established in Chapter 2. 
Motivated by using more flexible task distribution representations, we show how parametric assumptions play a crucial role in the empirical success of the previous approach and subsequently uncover key ingredients that enable the generation of meaningful curricula without assuming a parametric model of the task distributions. One major ingredient is an explicit notion of task similarity via a distance function of two Markov Decision Processes. We turn towards optimal transport theory, allowing for flexible particle-based representations of the task distributions while properly considering the newly introduced metric structure of the task space. Combined with other improvements to our first method, such as a more aggressive restriction of the curriculum to tasks that are not too hard for the agent, the resulting approach delivers consistently high learning performance in multiple experiments. In the final Chapter 4, we apply the refined method of Chapter 3 to a trajectory-tracking task, in which we task an RL agent to follow a three-dimensional reference trajectory with the tip of an inverted pendulum mounted on a Barrett Whole Arm Manipulator. The access to only positional information results in a partially observable system that, paired with its inherent instability, underactuation, and non-trivial kinematic structure, presents a challenge for modern reinforcement learning algorithms, which we tackle via curricula. The technically infinite-dimensional task space of target trajectories allows us to probe the developed curriculum learning method for flaws that have not surfaced in the rather low-dimensional experiments of the previous chapters. Through an improved optimization scheme that better respects the non-Euclidean structure of target trajectories, we reliably generate curricula of trajectories to be tracked, resulting in faster and more robust learning compared to an RL baseline that does not exploit this form of structured learning. 
The learned policy matches the performance of an optimal control baseline on the real system, demonstrating the potential of curriculum RL to learn state estimation and control for non-linear tracking tasks jointly. In summary, this thesis introduces a perspective on reinforcement learning curricula as interpolations between task distributions. The methods developed under this perspective enjoy a precise formulation as optimization problems and deliver empirical benefits throughout experiments. Building upon this precise formulation may allow future work to advance the formal understanding of reinforcement learning curricula and, with that, enable the solution of challenging decision-making and control problems with reinforcement learning
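
    The core idea of a curriculum as an interpolation between task distributions can be sketched for the simple case of one-dimensional Gaussian task distributions, where the 2-Wasserstein geodesic interpolates means and standard deviations linearly. In the thesis the interpolation pace is chosen by trading off agent performance against distance to the target distribution; here the pace is simply uniform, and all parameter values are illustrative.

```python
import numpy as np

def w2_interpolation(mu0, s0, mu1, s1, alpha):
    """Point on the 2-Wasserstein geodesic between N(mu0, s0^2) and N(mu1, s1^2).

    For one-dimensional Gaussians the W2 geodesic interpolates the mean and
    the standard deviation linearly; alpha in [0, 1] moves source -> target.
    """
    return (1 - alpha) * mu0 + alpha * mu1, (1 - alpha) * s0 + alpha * s1

def curriculum(mu0, s0, mu1, s1, n_stages, n_tasks, rng):
    """Sample task parameters from each interpolating stage distribution."""
    stages = []
    for alpha in np.linspace(0.0, 1.0, n_stages):
        mu, s = w2_interpolation(mu0, s0, mu1, s1, alpha)
        stages.append(rng.normal(mu, s, size=n_tasks))
    return stages

# Example: curriculum from easy tasks around 0 to hard tasks around 5,
# with the task-parameter spread widening from 0.1 to 0.5.
rng = np.random.default_rng(0)
stages = curriculum(0.0, 0.1, 5.0, 0.5, 5, 2000, rng)
```

An RL agent would train on each stage in turn, so that early stages are easy and the final stage matches the target task distribution.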

    Learning-based methods for planning and control of humanoid robots

    Humans and robots are increasingly likely to coexist. The anthropomorphic nature of humanoid robots facilitates physical human-robot interaction and makes social human-robot interaction more natural. Moreover, it makes humanoids ideal candidates for many applications involving tasks and environments designed for humans. No matter the application, a ubiquitous requirement for a humanoid is to possess proper locomotion skills. Despite long-lasting research, humanoid locomotion is still far from being a trivial task. A common approach to humanoid locomotion is to decompose its complexity by means of a model-based hierarchical control architecture. To cope with computational constraints, simplified models of the humanoid are employed in some of the architectural layers. At the same time, the redundancy of the humanoid with respect to the locomotion task, as well as the closeness of that task to human locomotion, suggests a data-driven approach that learns it directly from experience. This thesis investigates the application of learning-based techniques to planning and control of humanoid locomotion. In particular, both deep reinforcement learning and deep supervised learning are considered to address humanoid locomotion tasks in a crescendo of complexity. First, we employ deep reinforcement learning to study the spontaneous emergence of balancing and push-recovery strategies for the humanoid, which represent essential prerequisites for more complex locomotion tasks. Then, making use of motion capture data collected from human subjects, we employ deep supervised learning to shape the robot's walking trajectories towards improved human-likeness. The proposed approaches are validated on real and simulated humanoid robots, specifically on two versions of the iCub humanoid: iCub v2.7 and iCub v3.
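
    The supervised trajectory-shaping step can be caricatured with a linear stand-in for the deep network: fit a regularized least-squares map from human mocap features to robot joint-space targets. The synthetic data, feature dimensions, and ridge coefficient below are all assumptions for illustration; the thesis uses deep networks and real motion capture data.

```python
import numpy as np

def fit_ridge(X, Y, lam=1e-3):
    """Ridge regression: a linear stand-in for the deep supervised model
    mapping per-frame mocap features X to robot joint-space targets Y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y)

# Illustrative synthetic data: a noisy linear mapping standing in for
# real mocap recordings and retargeted joint trajectories.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 6))        # 500 frames, 6 mocap features each
W_true = rng.normal(size=(6, 4))     # unknown feature-to-joint map
Y = X @ W_true + 0.01 * rng.normal(size=(500, 4))
W = fit_ridge(X, Y)                  # recovered mapping
```

In the actual pipeline the fitted model would be used to bias the walking-trajectory generator of the hierarchical controller towards human-like motion.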

    Interval type-2 fuzzy logic control optimized by spiral dynamic algorithm for two-wheeled wheelchair

    The reconfiguration of a two-wheeled wheelchair system with a movable payload has been investigated in the current study to permit multi-task operations: enhanced maneuverability on a flat surface under disturbance rejection during forward and backward motions, motion on inclined surfaces for uphill and downhill travel, and height extension of the wheelchair's seat. The study pursues three objectives: developing Interval Type-2 Fuzzy Logic Control (IT2FLC) as the control system, designing a Spiral Dynamic Algorithm (SDA) for IT2FLC to stabilize the designed double-link two-wheeled wheelchair system, and optimizing the input-output gains and control parameters. The two-wheeled configuration offers many benefits to the user, such as requiring less space to turn the wheelchair, the ability to move in narrow spaces, eye-to-eye contact with standing people, and the ability to reach items on higher shelves. However, the two-wheeled system exhibits large fluctuations due to uncertainties while being stabilized in the upright position, which in turn causes long traveled distances and high magnitudes of tilt angle and torque. Thus, IT2FLC has been proposed as a suitable control strategy for disturbance rejection, overcoming uncertainties and enhancing system stability in the upright position. IT2FLC uses a Type-2 Fuzzy Set (T2FS), whose membership functions (MFs) comprise lower MFs, upper MFs, and a footprint of uncertainty (FOU); this is why IT2FLC is able to handle the nonlinearities and uncertainties that occur in the system. Therefore, disturbances applied at the back of the seat can be rejected using the proposed IT2FLC controller. 
Additionally, SDA is implemented within the control strategy to obtain optimal values of the IT2FLC input-output control gains and the parameters of its MFs, further accommodating the extensive fluctuations of the two-wheeled system and thus ensuring a safe and comfortable experience for users via shorter traveled distances and lower torque magnitudes following disruptions. The two-wheeled wheelchair is designed using SimWise 4D software to overcome the shortcomings of a linearized mathematical model, where lengthy equations with various assumptions would be required to represent the proposed system without forgoing its nonlinearity and complexity. Moreover, a 70 kg payload was included to embody an average user in simulating vertical extensions of the system from 0.11 m to 0.25 m. The completed model is then integrated with Matlab/Simulink for control design and performance evaluation through visualized simulations. The approach has been compared with a previous controller, Type-1 Fuzzy Logic Control (FLCT1), to gauge improvements and performance superiority. The significance of SDA-IT2FLC as the stability controller for the investigated system has been confirmed by the current findings, which outperformed its predecessors (IT2FLC and FLCT1). These results are supported by a significant reduction in traveled distance, tilt, and control torques: recorded improvements of 5.6% and 33.3% in system stability relative to the heuristically tuned IT2FLC, as well as 60% and 94% improvements in the system's angular positions compared to FLCT1. Moreover, a 95.4% reduction in torques has been recorded for SDA-IT2FLC compared to FLCT1. 
Ultimately, SDA-IT2FLC has demonstrated promising outcomes over its predecessors in maintaining the system's stability in an upright position, with faster convergence and a significant reduction in traveled distance, tilt and control torques, proving itself a robust controller for a double-link two-wheeled wheelchair with a movable payload system.
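
    The spiral dynamics behind such tuning can be sketched as follows: candidate gain vectors are repeatedly rotated and contracted about the best point found so far. The toy quadratic surrogate cost, rotation angle, and contraction rate below are illustrative assumptions; in the actual study each candidate would instead be scored by the closed-loop SimWise/Simulink simulation.

```python
import numpy as np

def spiral_step(x, x_best, r=0.95, theta=np.pi / 4):
    """One SDA update in 2-D: rotate by theta and contract by r about the
    current best point, i.e. x_next = S x + (I - S) x_best with S = r R(theta)."""
    c, s = np.cos(theta), np.sin(theta)
    S = r * np.array([[c, -s], [s, c]])
    return S @ x + (np.eye(2) - S) @ x_best

def sda_optimize(cost, n_points=20, n_iters=200, seed=0):
    """Minimize cost over 2-D gain vectors with spiral dynamics."""
    rng = np.random.default_rng(seed)
    pts = rng.uniform(-5.0, 5.0, size=(n_points, 2))
    best = min(pts, key=cost)
    for _ in range(n_iters):
        pts = np.array([spiral_step(p, best) for p in pts])
        cand = min(pts, key=cost)
        if cost(cand) < cost(best):
            best = cand
    return best

# Toy surrogate cost over a (Kp, Kd) gain pair, minimized at (2.0, 0.5);
# a real run would evaluate the wheelchair simulation here instead.
cost = lambda g: (g[0] - 2.0) ** 2 + (g[1] - 0.5) ** 2
best = sda_optimize(cost)
```

The rotate-and-contract update lets candidates sweep around the incumbent before collapsing onto it, which is what gives SDA its balance of exploration and convergence.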