How to Reuse and Compose Knowledge for a Lifetime of Tasks: A Survey on Continual Learning and Functional Composition
A major goal of artificial intelligence (AI) is to create an agent capable of
acquiring a general understanding of the world. Such an agent would require the
ability to continually accumulate and build upon its knowledge as it encounters
new experiences. Lifelong or continual learning addresses this setting, whereby
an agent faces a continual stream of problems and must strive to capture the
knowledge necessary for solving each new task it encounters. If the agent is
capable of accumulating knowledge in some form of compositional representation,
it could then selectively reuse and combine relevant pieces of knowledge to
construct novel solutions. Despite the intuitive appeal of this simple idea,
the literatures on lifelong learning and compositional learning have proceeded
largely separately. In an effort to promote developments that bridge between
the two fields, this article surveys their respective research landscapes and
discusses existing and future connections between them.
Interactive Imitation Learning in Robotics: A Survey
Interactive Imitation Learning (IIL) is a branch of Imitation Learning (IL)
where human feedback is provided intermittently during robot execution allowing
an online improvement of the robot's behavior. In recent years, IIL has
increasingly started to carve out its own space as a promising data-driven
alternative for solving complex robotic tasks. The advantages of IIL are its
data efficiency, as the human feedback guides the robot directly towards an
improved behavior, and its robustness, as the distribution mismatch between the
teacher and learner trajectories is minimized by providing feedback directly
over the learner's trajectories. Nevertheless, despite the opportunities that
IIL presents, its terminology, structure, and applicability are neither clear
nor unified in the literature, slowing down its development and, therefore, the
research of innovative formulations and discoveries. In this article, we
attempt to facilitate research in IIL and lower entry barriers for new
practitioners by providing a survey of the field that unifies and structures
it. In addition, we aim to raise awareness of its potential, what has been
accomplished, and which research questions remain open. We organize the most
relevant works in IIL in terms of human-robot interaction (i.e., types of
feedback), interfaces (i.e., means of providing feedback), learning (i.e.,
models learned from feedback and function approximators), user experience
(i.e., human perception about the learning process), applications, and
benchmarks. Furthermore, we analyze similarities and differences between IIL
and reinforcement learning (RL), providing a discussion on how the concepts of
offline, online, off-policy, and on-policy learning should be transferred to
IIL from the RL literature. We particularly focus on robotic applications in
the real world and discuss their implications, limitations, and promising
future areas of research.
Machine Learning Meets Advanced Robotic Manipulation
Automated industries lead to high-quality production, lower manufacturing
costs, and better utilization of human resources. Robotic manipulator arms play
a major role in the automation process. However, for complex manipulation tasks,
hard-coding efficient and safe trajectories is challenging and time-consuming.
Machine learning (ML) methods have the potential to learn such controllers based on
expert demonstrations. Despite promising advances, better approaches must be
developed to improve safety, reliability, and efficiency of ML methods in both
training and deployment phases. This survey aims to review cutting-edge
technologies and recent trends in ML methods applied to real-world manipulation
tasks. After a review of the related background on ML, the rest of the paper is
devoted to ML applications in different domains such as industry, healthcare,
agriculture, space, military, and search and rescue. The paper closes with
important research directions for future work.
The State of Lifelong Learning in Service Robots: Current Bottlenecks in Object Perception and Manipulation
Service robots are appearing more and more in our daily life. The development
of service robots combines multiple fields of research, from object perception
to object manipulation. The state of the art continues to improve toward a
proper coupling between object perception and manipulation. This coupling is
necessary for service robots not only to perform various tasks in a reasonable
amount of time but also to continually adapt to new environments and safely
interact with non-expert human users. Nowadays, robots are able to recognize
various objects, and quickly plan a collision-free trajectory to grasp a target
object in predefined settings. Moreover, in most cases, they rely on
large amounts of training data. Therefore, the knowledge of such
robots is fixed after the training phase, and any changes in the environment
require complicated, time-consuming, and expensive robot re-programming by
human experts. Therefore, these approaches are still too rigid for real-life
applications in unstructured environments, where a significant portion of the
environment is unknown and cannot be directly sensed or controlled. In such
environments, no matter how extensive the training data used for batch
learning, a robot will always face new objects. Therefore, apart from batch
learning, the robot should be able to continually learn about new object
categories and grasp affordances from very few training examples on-site.
Moreover, apart from robot self-learning, non-expert users could interactively
guide the process of experience acquisition by teaching new concepts, or by
correcting insufficient or erroneous concepts. In this way, the robot will
constantly learn how to help humans in everyday tasks by gaining more and more
experience without the need for re-programming.
DREAM Architecture: a Developmental Approach to Open-Ended Learning in Robotics
Robots are still limited to controlled conditions that the robot designer
knows in enough detail to endow the robot with the appropriate models or
behaviors. Learning algorithms add some flexibility with the ability to
discover the appropriate behavior given either some demonstrations or a reward
to guide its exploration with a reinforcement learning algorithm. Reinforcement
learning algorithms rely on the definition of state and action spaces that
define reachable behaviors. Their adaptation capability critically depends on
the representations of these spaces: small and discrete spaces result in fast
learning while large and continuous spaces are challenging and either require a
long training period or prevent the robot from converging to an appropriate
behavior. Besides the operational cycle of policy execution and the learning
cycle, which works at a slower time scale to acquire new policies, we introduce
the redescription cycle, a third cycle working at an even slower time scale to
generate or adapt the required representations to the robot, its environment
and the task. We introduce the challenges raised by this cycle and we present
DREAM (Deferred Restructuring of Experience in Autonomous Machines), a
developmental cognitive architecture to bootstrap this redescription process
stage by stage, build new state representations with appropriate motivations,
and transfer the acquired knowledge across domains or tasks or even across
robots. We describe results obtained so far with this approach and conclude
with a discussion of the questions it raises in neuroscience.
Improved Exploration with Stochastic Policies in Deep Reinforcement Learning
Deep reinforcement learning has recently shown promising results in robot control, but even current state-of-the-art algorithms fail to solve seemingly simple realistic tasks. For example, OpenAI et al. (2019) demonstrate the learning of dexterous in-hand manipulation of objects lying on the palm of an upward-facing robot hand. However, manipulating an object from above (i.e., the hand is oriented upside-down) turns out to be fundamentally more difficult to learn for current algorithms because the object has to be robustly grasped at all times to avoid immediate failure. In this thesis, we identify the commonly used naive exploration strategies as the main issue. Therefore, we propose to utilize more expressive stochastic policy distributions to enable reinforcement learning agents to learn to explore in a targeted manner. In particular, we extend the Soft Actor-Critic algorithm with policy distributions of varying expressiveness. We analyze how these variants explore in simplified environments with adjustable difficulties that we designed specifically to mimic the core problem of dexterous in-hand manipulation. We find that stochastic policies with expressive distributions can learn fundamentally more complex tasks. Moreover, beyond the exploration behavior, we show that in partially observable environments, agents that represent their final (learned) policy with expressive distributions can solve tasks where commonly used simpler distributions fail.
Lifelong Machine Learning Of Functionally Compositional Structures
A hallmark of human intelligence is the ability to construct self-contained chunks of knowledge and reuse them in novel combinations for solving different yet structurally related problems. Learning such compositional structures has been a significant challenge for artificial systems, due to the underlying combinatorial search. To date, research into compositional learning has largely proceeded separately from work on lifelong or continual learning. This dissertation integrated these two lines of work to present a general-purpose framework for lifelong learning of functionally compositional structures. The framework separates the learning into two stages: learning how to best combine existing components to assimilate a novel problem, and learning how to adapt the set of existing components to accommodate the new problem. This separation explicitly handles the trade-off between the stability required to remember how to solve earlier tasks and the flexibility required to solve new tasks. This dissertation instantiated the framework into various supervised and reinforcement learning (RL) algorithms. Empirical evaluations on a range of supervised learning benchmarks compared the proposed algorithms against well-established techniques, and found that 1) compositional models enable improved lifelong learning when the tasks are highly diverse by balancing the incorporation of new knowledge and the retention of past knowledge, 2) the separation of the learning into stages permits lifelong learning of compositional knowledge, and 3) the components learned by the proposed methods represent self-contained and reusable functions. Similar evaluations on existing and new RL benchmarks demonstrated that 1) algorithms under the framework accelerate the discovery of high-performing policies in a variety of domains, including robotic manipulation, and 2) these algorithms retain, and often improve, knowledge that enables them to solve tasks learned in the past.
The dissertation extended one lifelong compositional RL algorithm to the nonstationary setting, where the distribution over tasks varies over time, and found that modularity permits individually tracking changes to different elements in the environment. The final contribution of this dissertation was a new benchmark for evaluating approaches to compositional RL, which exposed that existing methods struggle to discover the compositional properties of the environment.
Pretraining in Deep Reinforcement Learning: A Survey
The past few years have seen rapid progress in combining reinforcement
learning (RL) with deep learning. Various breakthroughs ranging from games to
robotics have spurred the interest in designing sophisticated RL algorithms and
systems. However, the prevailing workflow in RL is to learn tabula rasa, which
may incur computational inefficiency. This precludes continuous deployment of
RL algorithms and potentially excludes researchers without large-scale
computing resources. In many other areas of machine learning, the pretraining
paradigm has been shown to be effective in acquiring transferable knowledge,
which can be utilized for a variety of downstream tasks. Recently, there has
been a surge of interest in pretraining for deep RL, with promising results.
However, much of
the research has been based on different experimental settings. Due to the
nature of RL, pretraining in this field is faced with unique challenges and
hence requires new design principles. In this survey, we seek to systematically
review existing works in pretraining for deep reinforcement learning, provide a
taxonomy of these methods, discuss each sub-field, and bring attention to open
problems and future directions.