Reinforcement learning algorithms that assimilate and accommodate skills with multiple tasks
Children are capable of acquiring a large repertoire of motor skills and of efficiently adapting them to novel conditions. In a previous work we proposed a hierarchical modular reinforcement learning model (RANK) that can learn multiple motor skills in continuous action and state spaces. The model is based on an extension of the mixture-of-experts model, adapted to work with reinforcement learning. In particular, the model uses a high-level gating network to assign responsibilities for acting and for learning to a set of low-level expert networks. The model was also developed with the goal of exploiting the Piagetian mechanisms of assimilation and accommodation to support the learning of multiple tasks. This paper proposes a new model (TERL - Transfer Expert Reinforcement Learning) that substantially improves on RANK. The key difference with respect to the previous model is the decoupling of the mechanisms that generate the experts' responsibility signals for learning and for control. This makes it possible to satisfy different constraints for functioning and for learning. We test both the TERL and the RANK models with a two-DOF dynamic arm engaged in solving multiple reaching tasks, and compare the two with a simple, flat reinforcement learning model. The results show that both models are capable of exploiting assimilation and accommodation processes to transfer knowledge between similar tasks while avoiding catastrophic interference. Furthermore, the TERL model is shown to significantly outperform the RANK model thanks to its faster and more stable specialization of experts.
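The decoupling of responsibility signals described in this abstract can be illustrated with a minimal sketch. It is not the TERL implementation: the softmax gate, the error-based learning responsibilities, the temperature values, and all function and variable names are assumptions chosen only for illustration.

```python
import math


def softmax(scores, temperature=1.0):
    """Turn raw scores into positive weights that sum to 1."""
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp((s - top) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def control_responsibilities(gate_scores, temperature=1.0):
    """Weights for blending expert actions (assumed: softmax over gate outputs)."""
    return softmax(gate_scores, temperature)


def learning_responsibilities(expert_errors, temperature=0.1):
    """Weights for scaling expert updates (assumed: lower error, larger share)."""
    return softmax([-e for e in expert_errors], temperature)


def blended_action(expert_actions, weights):
    """Weighted sum of the experts' action vectors."""
    dims = len(expert_actions[0])
    return [
        sum(w * a[d] for w, a in zip(weights, expert_actions))
        for d in range(dims)
    ]


# Hypothetical values for three experts driving a two-joint arm.
gate_scores = [2.0, 0.5, 0.1]
expert_errors = [0.40, 0.05, 0.90]
expert_actions = [[0.2, -0.1], [0.6, 0.3], [-0.4, 0.0]]

act_w = control_responsibilities(gate_scores)
learn_w = learning_responsibilities(expert_errors)
action = blended_action(expert_actions, act_w)
```

With these made-up numbers the first expert dominates control while the second receives most of the learning signal, a split that a single shared responsibility signal could not express.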
Modular and hierarchical brain organization to understand assimilation, accommodation and their relation to autism in reaching tasks: a developmental robotics hypothesis
By "assimilation" the child embodies the sensorimotor experience into already built mental structures. Conversely, by "accommodation" these structures are changed according to the child's new experiences. Despite the intuitive power of these concepts to trace the course of sensorimotor development, they have gradually lost ground in psychology. This is likely due to a lack of brain-related views capturing the dynamic mechanisms underlying them. Here we propose that the modular and hierarchical organization of the brain is crucial to understanding assimilation/accommodation. We devised an experiment in which a bio-inspired modular and hierarchical mixture-of-experts model guides a simulated robot to learn different reaching tasks by trial and error. The model gives a novel interpretation of assimilation/accommodation based on the functional organization of the experts allocated through learning. Assimilation occurs when the model adapts a copy of the expert trained for solving a task to face another task requiring similar sensorimotor mappings. Experts storing similar sensorimotor mappings belong to the same functional module. Accommodation occurs when the model uses non-trained experts to face tasks requiring different sensorimotor mappings (generating a new functional group of experts). The model provides a new theoretical framework to investigate impairments in assimilation/accommodation in the autistic syndrome.
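The distinction between assimilation and accommodation drawn in this abstract can be sketched as an expert-allocation rule. In the abstract the allocation emerges through learning; the explicit similarity threshold, the cosine similarity over task descriptors, and every name below are hypothetical simplifications, not the authors' method.

```python
import copy
import math


def cosine(u, v):
    """Cosine similarity between two task descriptors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def allocate_expert(new_task, trained, threshold=0.9):
    """Return (mode, expert_params) for a new task.

    trained: list of (task_descriptor, expert_params) pairs.
    The threshold is an illustrative assumption.
    """
    best = max(trained, key=lambda item: cosine(new_task, item[0]), default=None)
    if best is not None and cosine(new_task, best[0]) >= threshold:
        # Assimilation: start from a copy of the most similar trained expert.
        return "assimilation", copy.deepcopy(best[1])
    # Accommodation: start from a non-trained expert (zero parameters here).
    return "accommodation", [0.0] * len(new_task)


# One expert already trained on a hypothetical reaching task.
trained = [([1.0, 0.0], [0.8, -0.2])]
near = allocate_expert([0.9, 0.1], trained)  # similar task
far = allocate_expert([0.0, 1.0], trained)   # dissimilar task
```

Copying the expert rather than reusing it in place leaves the original expert untouched, which is the property that lets the earlier task survive while the new one is learned.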
System Design for an Integrated Lifelong Reinforcement Learning Agent for Real-Time Strategy Games
As Artificial and Robotic Systems are increasingly deployed and relied upon
for real-world applications, it is important that they exhibit the ability to
continually learn and adapt in dynamically-changing environments, becoming
Lifelong Learning Machines. Continual/lifelong learning (LL) involves
minimizing catastrophic forgetting of old tasks while maximizing a model's
capability to learn new tasks. This paper addresses the challenging lifelong
reinforcement learning (L2RL) setting. Pushing the state-of-the-art forward in
L2RL and making L2RL useful for practical applications requires more than
developing individual L2RL algorithms; it requires making progress at the
systems-level, especially research into the non-trivial problem of how to
integrate multiple L2RL algorithms into a common framework. In this paper, we
introduce the Lifelong Reinforcement Learning Components Framework (L2RLCF),
which standardizes L2RL systems and assimilates different continual learning
components (each addressing different aspects of the lifelong learning problem)
into a unified system. As an instantiation of L2RLCF, we develop a standard API
allowing easy integration of novel lifelong learning components. We describe a
case study that demonstrates how multiple independently-developed LL components
can be integrated into a single realized system. We also introduce an
evaluation environment in order to measure the effect of combining various
system components. Our evaluation environment employs different LL scenarios
(sequences of tasks) consisting of Starcraft-2 minigames and allows for the
fair, comprehensive, and quantitative comparison of different combinations of
components within a challenging common evaluation environment.
Comment: The Second International Conference on AIML Systems, October 12-15, 2022, Bangalore, India
In-service training for computer-aided design in building surveying
The investigation was undertaken firstly to identify, classify and assess requirements
and methods for in-service training in the use of computer-aided design (CAD)
systems in UK building surveying practice. The second purpose was to develop, test
and assess alternative instructional methods for practitioners to acquire and develop
capabilities for appropriate use of CAD.
Requirements, opportunities and constraints were informed through discussion with
practitioners, suppliers of CAD systems or associated services, and a postal survey of
50 UK building surveying practices. Collated information was considered within
Romiszowski's (1984) framework for problem solving in the organisation. Conventional methods for CAD training in the UK construction industry, and relevant
instructional theory, were investigated in a literature search. Alternative instructional
models and methods were identified and developed through an action research
methodology based upon Cohen and Manion (1989). Proposals were assessed
conceptually using the first three of Popper's (1959) four tests for theories. Prototyping
core components, substantially by computer-based methods, and classroom
experiments with students of building surveying, or clients of the Leicester CAD
Centre, both at De Montfort University, were used in place of Popper's fourth test.
The research findings contribute detailed analysis of requirements, provision and
constraints to a sparse knowledge base for use of CAD in building surveying. They
also provide a critical review of conventional methods for developing users of the
technology in this domain. Three core principles are proposed to guide the policies and
actions of building surveying practices in relation to CAD, emphasising integration of
staff development within an overall CAD strategy. An alternative instructional model,
synthesised from results across the research programme, is recommended for
developing relevant practical capabilities with CAD. Corresponding specifications are
made for a hybrid of manual, interpersonal and computer-based methods for its
implementation. The model is set in the context of wider considerations for effective
use of CAD technology, and is independent of particular software systems, types of
workplace and trainee. Theoretically the model is capable of rapidly enabling staff in
any practice to apply relevant CAD hardware and software effectively to authentic
tasks, and subsequently contribute to developing application methods in the workplace. In conjunction with recommended operational principles the alternative instructional
model improves significantly upon conventional methods identified for in-service
training in CAD by provision for strategic integration, system independence, and
responsiveness to local requirements.
The investigation concluded by identifying four foci for further research and
development to overcome constraints on implementing the model by the methods
prototyped. A fifth focus recommends investigation of an optimal model and methods
to develop capabilities of staff in building surveying practices for appraising,
implementing, managing and developing the use of CAD systems.
Continual Lifelong Learning with Neural Networks: A Review
Humans and animals have the ability to continually acquire, fine-tune, and
transfer knowledge and skills throughout their lifespan. This ability, referred
to as lifelong learning, is mediated by a rich set of neurocognitive mechanisms
that together contribute to the development and specialization of our
sensorimotor skills as well as to long-term memory consolidation and retrieval.
Consequently, lifelong learning capabilities are crucial for autonomous agents
interacting in the real world and processing continuous streams of information.
However, lifelong learning remains a long-standing challenge for machine
learning and neural network models since the continual acquisition of
incrementally available information from non-stationary data distributions
generally leads to catastrophic forgetting or interference. This limitation
represents a major drawback for state-of-the-art deep neural network models
that typically learn representations from stationary batches of training data,
thus without accounting for situations in which information becomes
incrementally available over time. In this review, we critically summarize the
main challenges linked to lifelong learning for artificial learning systems and
compare existing neural network approaches that alleviate, to different
extents, catastrophic forgetting. We discuss well-established and emerging
research motivated by lifelong learning factors in biological systems such as
structural plasticity, memory replay, curriculum and transfer learning,
intrinsic motivation, and multisensory integration.
Final report key contents: main results accomplished by the EU-Funded project IM-CLeVeR - Intrinsically Motivated Cumulative Learning Versatile Robots
This document presents the main scientific and technological achievements of the project IM-CLeVeR. The document is organised as follows: 1. Project executive summary: a brief overview of the project vision, objectives and keywords. 2. Beneficiaries of the project and contacts: list of Teams (partners) of the project, Team Leaders and contacts. 3. Project context and objectives: the vision of the project and its overall objectives. 4. Overview of work performed and main results achieved: a one-page overview of the main results of the project. 5. Overview of main results per partner: a bullet-point list of main results per partner. 6. Main achievements in detail, per partner: a thorough explanation of the main results per partner (including collaborative work), with reference to the main publications supporting them.
Learning object relationships which determine the outcome of actions
Peer reviewed. Publisher PD
Autonomous selection of the "what" and the "how" of learning: an intrinsically motivated system tested with a two armed robot
In our previous research we focused on the role of intrinsically motivated learning signals in driving the selection and learning of different skills. This work takes a further step towards more autonomous and versatile robots, implementing a 3-level hierarchical architecture with the mechanisms necessary both to select goals to pursue and to search for the best way to achieve them. In particular, we focus on the important problem of providing artificial agents with a decoupled architecture that separates the selection of goals from the selection of resources. To verify our solution, we use the architecture to control the two redundant arms of a simulated iCub robotic platform tested in a reaching task within a 3D environment. We compare its performance to a previous model with a coupled architecture, in which the different goals are associated at design time with the different modules pursuing them.