563 research outputs found
Fast Damage Recovery in Robotics with the T-Resilience Algorithm
Damage recovery is critical for autonomous robots that need to operate for a
long time without assistance. Most current methods are complex and costly
because they require anticipating each potential damage in order to have a
contingency plan ready. As an alternative, we introduce the T-resilience
algorithm, a new algorithm that allows robots to quickly and autonomously
discover compensatory behaviors in unanticipated situations. This algorithm
equips the robot with a self-model and discovers new behaviors by learning to
avoid those that perform differently in the self-model and in reality. Our
algorithm thus does not identify the damaged parts but it implicitly searches
for efficient behaviors that do not use them. We evaluate the T-Resilience
algorithm on a hexapod robot that needs to adapt to leg removal, broken legs
and motor failures; we compare it to stochastic local search, policy gradient
and the self-modeling algorithm proposed by Bongard et al. The behavior of the
robot is assessed on-board thanks to a RGB-D sensor and a SLAM algorithm. Using
only 25 tests on the robot and an overall running time of 20 minutes,
T-Resilience consistently leads to substantially better results than the other
approaches
MOTION CONTROL SIMULATION OF A HEXAPOD ROBOT
This thesis addresses hexapod robot motion control. Insect morphology and locomotion patterns inform the design of a robotic model, and motion control is achieved via trajectory planning and bio-inspired principles. Additionally, deep learning and multi-agent reinforcement learning are employed to train the robot motion control strategy with leg coordination achieves using a multi-agent deep reinforcement learning framework. The thesis makes the following contributions:
First, research on legged robots is synthesized, with a focus on hexapod robot motion control. Insect anatomy analysis informs the hexagonal robot body and three-joint single robotic leg design, which is assembled using SolidWorks. Different gaits are studied and compared, and robot leg kinematics are derived and experimentally verified, culminating in a three-legged gait for motion control.
Second, an animal-inspired approach employs a central pattern generator (CPG) control unit based on the Hopf oscillator, facilitating robot motion control in complex environments such as stable walking and climbing. The robot\u27s motion process is quantitatively evaluated in terms of displacement change and body pitch angle.
Third, a value function decomposition algorithm, QPLEX, is applied to hexapod robot motion control. The QPLEX architecture treats each leg as a separate agent with local control modules, that are trained using reinforcement learning. QPLEX outperforms decentralized approaches, achieving coordinated rhythmic gaits and increased robustness on uneven terrain. The significant of terrain curriculum learning is assessed, with QPLEX demonstrating superior stability and faster consequence.
The foot-end trajectory planning method enables robot motion control through inverse kinematic solutions but has limited generalization capabilities for diverse terrains. The animal-inspired CPG-based method offers a versatile control strategy but is constrained to core aspects. In contrast, the multi-agent deep reinforcement learning-based approach affords adaptable motion strategy adjustments, rendering it a superior control policy. These methods can be combined to develop a customized robot motion control policy for specific scenarios
Reinforcement Learning for Agents with Many Sensors and Actuators Acting in Categorizable Environments
In this paper, we confront the problem of applying reinforcement learning to
agents that perceive the environment through many sensors and that can perform
parallel actions using many actuators as is the case in complex autonomous
robots. We argue that reinforcement learning can only be successfully applied
to this case if strong assumptions are made on the characteristics of the
environment in which the learning is performed, so that the relevant sensor
readings and motor commands can be readily identified. The introduction of such
assumptions leads to strongly-biased learning systems that can eventually lose
the generality of traditional reinforcement-learning algorithms. In this line,
we observe that, in realistic situations, the reward received by the robot
depends only on a reduced subset of all the executed actions and that only a
reduced subset of the sensor inputs (possibly different in each situation and
for each action) are relevant to predict the reward. We formalize this property
in the so called 'categorizability assumption' and we present an algorithm that
takes advantage of the categorizability of the environment, allowing a decrease
in the learning time with respect to existing reinforcement-learning
algorithms. Results of the application of the algorithm to a couple of
simulated realistic-robotic problems (landmark-based navigation and the
six-legged robot gait generation) are reported to validate our approach and to
compare it to existing flat and generalization-based reinforcement-learning
approaches
Robots that can adapt like animals
As robots leave the controlled environments of factories to autonomously
function in more complex, natural environments, they will have to respond to
the inevitable fact that they will become damaged. However, while animals can
quickly adapt to a wide variety of injuries, current robots cannot "think
outside the box" to find a compensatory behavior when damaged: they are limited
to their pre-specified self-sensing abilities, can diagnose only anticipated
failure modes, and require a pre-programmed contingency plan for every type of
potential damage, an impracticality for complex robots. Here we introduce an
intelligent trial and error algorithm that allows robots to adapt to damage in
less than two minutes, without requiring self-diagnosis or pre-specified
contingency plans. Before deployment, a robot exploits a novel algorithm to
create a detailed map of the space of high-performing behaviors: This map
represents the robot's intuitions about what behaviors it can perform and their
value. If the robot is damaged, it uses these intuitions to guide a
trial-and-error learning algorithm that conducts intelligent experiments to
rapidly discover a compensatory behavior that works in spite of the damage.
Experiments reveal successful adaptations for a legged robot injured in five
different ways, including damaged, broken, and missing legs, and for a robotic
arm with joints broken in 14 different ways. This new technique will enable
more robust, effective, autonomous robots, and suggests principles that animals
may use to adapt to injury
Trends in the control of hexapod robots: a survey
The static stability of hexapods motivates their design for tasks in which stable locomotion is required, such as navigation across complex environments. This task is of high interest due to the possibility of replacing human beings in exploration, surveillance and rescue missions. For this application, the control system must adapt the actuation of the limbs according to their surroundings to ensure that the hexapod does not tumble during locomotion. The most traditional approach considers their limbs as robotic manipulators and relies on mechanical models to actuate them. However, the increasing interest in model-free models for the control of these systems has led to the design of novel solutions. Through a systematic literature review, this paper intends to overview the trends in this field of research and determine in which stage the design of autonomous and adaptable controllers for hexapods is.The first author received funding through a doctoral scholarship from the Portuguese Foundation for Science and Technology (FCT) (Grant No. SFRH/BD/145818/2019), with funds from the Portuguese Ministry of Science, Technology and Higher Education and the European Social Fund through the Programa Operacional Regional Norte. This work has been supported by the FCT national funds, under the national support to R&D units grant, through the reference project UIDB/04436/2020 and UIDP/04436/2020
Learning to Exploit Elastic Actuators for Quadruped Locomotion
Spring-based actuators in legged locomotion provide energy-efficiency and
improved performance, but increase the difficulty of controller design. While
previous work has focused on extensive modeling and simulation to find optimal
controllers for such systems, we propose to learn model-free controllers
directly on the real robot. In our approach, gaits are first synthesized by
central pattern generators (CPGs), whose parameters are optimized to quickly
obtain an open-loop controller that achieves efficient locomotion. Then, to
make this controller more robust and further improve the performance, we use
reinforcement learning to close the loop, to learn corrective actions on top of
the CPGs. We evaluate the proposed approach on the DLR elastic quadruped bert.
Our results in learning trotting and pronking gaits show that exploitation of
the spring actuator dynamics emerges naturally from optimizing for dynamic
motions, yielding high-performing locomotion despite being model-free. The
whole process takes no more than 1.5 hours on the real robot and results in
natural-looking gaits
A Developmental Organization for Robot Behavior
This paper focuses on exploring how learning and development can be structured in synthetic (robot) systems. We present a developmental assembler for constructing reusable and temporally extended actions in a sequence. The discussion adopts the traditions
of dynamic pattern theory in which behavior
is an artifact of coupled dynamical systems
with a number of controllable degrees of freedom. In our model, the events that delineate
control decisions are derived from the pattern
of (dis)equilibria on a working subset of sensorimotor policies. We show how this architecture can be used to accomplish sequential
knowledge gathering and representation tasks
and provide examples of the kind of developmental milestones that this approach has
already produced in our lab
- …