    Reinforcement learning algorithms that assimilate and accommodate skills with multiple tasks

    Children are capable of acquiring a large repertoire of motor skills and of efficiently adapting them to novel conditions. In previous work we proposed a hierarchical modular reinforcement learning model (RANK) that can learn multiple motor skills in continuous action and state spaces. The model is based on an extension of the mixture-of-experts model that has been adapted to work with reinforcement learning. In particular, the model uses a high-level gating network to assign responsibilities for acting and for learning to a set of low-level expert networks. The model was also developed with the goal of exploiting the Piagetian mechanisms of assimilation and accommodation to support the learning of multiple tasks. This paper proposes a new model (TERL - Transfer Expert Reinforcement Learning) that substantially improves on RANK. The key difference with respect to the previous model is the decoupling of the mechanisms that generate the experts' responsibility signals for learning and for control. This makes it possible to satisfy different constraints for functioning and for learning. We test both the TERL and the RANK models with a two-DOF dynamic arm engaged in solving multiple reaching tasks, and compare the two with a simple, flat reinforcement learning model. The results show that both models are capable of exploiting assimilation and accommodation processes to transfer knowledge between similar tasks while avoiding catastrophic interference. Furthermore, the TERL model is shown to significantly outperform the RANK model thanks to its faster and more stable specialization of experts.
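
    The idea of separate responsibility signals for control and for learning can be illustrated with a minimal sketch. This is not the TERL or RANK implementation, which the abstract does not specify; it is a generic mixture-of-experts example in which the gating scores, the per-expert error signal, the temperatures, and all function names are assumptions made for illustration.

```python
import math

def softmax(scores, temperature=1.0):
    """Numerically stable softmax over a list of scores."""
    scaled = [s / temperature for s in scores]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def mix_actions(expert_actions, control_resp):
    """Blend expert actions (one list per expert) by control responsibility."""
    dims = len(expert_actions[0])
    return [
        sum(r * a[d] for r, a in zip(control_resp, expert_actions))
        for d in range(dims)
    ]

# Hypothetical values: three experts driving a two-joint arm.
gating_scores = [2.0, 1.0, 0.1]        # assumed output of a gating network
expert_actions = [[0.5, -0.2], [0.1, 0.4], [-0.3, 0.0]]
expert_errors = [0.05, 0.30, 0.90]     # assumed per-expert error signal

# Control responsibilities: a smooth blend of the experts' actions.
control_resp = softmax(gating_scores, temperature=1.0)
action = mix_actions(expert_actions, control_resp)

# Learning responsibilities: computed separately and made sharper, so that
# mostly the best-fitting expert would be updated (assumed design choice).
learning_resp = softmax([-e for e in expert_errors], temperature=0.1)
```

    Keeping the two signals separate lets the blend used for acting stay smooth while the signal used for updates concentrates on one expert, which is one plausible way to meet different constraints for functioning and for learning.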

    Modular and hierarchical brain organization to understand assimilation, accommodation and their relation to autism in reaching tasks: a developmental robotics hypothesis

    By "assimilation" the child embodies the sensorimotor experience into already built mental structures. Conversely, by "accommodation" these structures are changed according to the child's new experiences. Despite the intuitive power of these concepts to trace the course of sensorimotor development, they have gradually lost ground in psychology. This is likely due to a lack of brain-related views capturing the dynamic mechanisms underlying them. Here we propose that the brain's modular and hierarchical organization is crucial to understanding assimilation/accommodation. We devised an experiment where a bio-inspired modular and hierarchical mixture-of-experts model guides a simulated robot to learn different reaching tasks by trial and error. The model gives a novel interpretation of assimilation/accommodation based on the functional organization of the experts allocated through learning. Assimilation occurs when the model adapts a copy of the expert trained for solving a task to face another task requiring similar sensorimotor mappings. Experts storing similar sensorimotor mappings belong to the same functional module. Accommodation occurs when the model uses non-trained experts to face tasks requiring different sensorimotor mappings (generating a new functional group of experts). The model provides a new theoretical framework to investigate impairments in assimilation/accommodation in the autistic syndrome.

    System Design for an Integrated Lifelong Reinforcement Learning Agent for Real-Time Strategy Games

    As Artificial and Robotic Systems are increasingly deployed and relied upon for real-world applications, it is important that they exhibit the ability to continually learn and adapt in dynamically changing environments, becoming Lifelong Learning Machines. Continual/lifelong learning (LL) involves minimizing catastrophic forgetting of old tasks while maximizing a model's capability to learn new tasks. This paper addresses the challenging lifelong reinforcement learning (L2RL) setting. Pushing the state of the art forward in L2RL and making L2RL useful for practical applications requires more than developing individual L2RL algorithms; it requires making progress at the systems level, especially research into the non-trivial problem of how to integrate multiple L2RL algorithms into a common framework. In this paper, we introduce the Lifelong Reinforcement Learning Components Framework (L2RLCF), which standardizes L2RL systems and assimilates different continual learning components (each addressing different aspects of the lifelong learning problem) into a unified system. As an instantiation of L2RLCF, we develop a standard API allowing easy integration of novel lifelong learning components. We describe a case study that demonstrates how multiple independently developed LL components can be integrated into a single realized system. We also introduce an evaluation environment in order to measure the effect of combining various system components. Our evaluation environment employs different LL scenarios (sequences of tasks) consisting of Starcraft-2 minigames and allows for the fair, comprehensive, and quantitative comparison of different combinations of components within a challenging common evaluation environment. Comment: The Second International Conference on AIML Systems, October 12-15, 2022, Bangalore, India

    Computer Assisted Learning: Its Educational Potential (UNCAL)


    In-service training for computer-aided design in building surveying

    The investigation was undertaken firstly to identify, classify and assess requirements and methods for in-service training in the use of computer-aided design (CAD) systems in UK building surveying practice. The second purpose was to develop, test and assess alternative instructional methods for practitioners to acquire and develop capabilities for appropriate use of CAD. Requirements, opportunities and constraints were informed through discussion with practitioners, suppliers of CAD systems or associated services, and a postal survey of 50 UK building surveying practices. Collated information was considered within Romiszowski's (1984) framework for problem solving in the organisation. Conventional methods for CAD training in the UK construction industry, and relevant instructional theory, were investigated in a literature search. Alternative instructional models and methods were identified and developed through an action research methodology based upon Cohen and Manion (1989). Proposals were assessed conceptually using the first three of Popper's (1959) four tests for theories. Prototyping core components, substantially by computer-based methods, and classroom experiments with students of building surveying, or clients of the Leicester CAD Centre, both at De Montfort University, were used in place of Popper's fourth test. The research findings contribute detailed analysis of requirements, provision and constraints to a sparse knowledge base for use of CAD in building surveying. They also provide a critical review of conventional methods for developing users of the technology in this domain. Three core principles are proposed to guide the policies and actions of building surveying practices in relation to CAD, emphasising integration of staff development within an overall CAD strategy. An alternative instructional model, synthesised from results across the research programme, is recommended for developing relevant practical capabilities with CAD. 
    Corresponding specifications are made for a hybrid of manual, interpersonal and computer-based methods for its implementation. The model is set in the context of wider considerations for effective use of CAD technology, and is independent of particular software systems, types of workplace and trainee. Theoretically the model is capable of rapidly enabling staff in any practice to apply relevant CAD hardware and software effectively to authentic tasks, and subsequently contribute to developing application methods in the workplace. In conjunction with recommended operational principles the alternative instructional model improves significantly upon conventional methods identified for in-service training in CAD by provision for strategic integration, system independence, and responsiveness to local requirements. The investigation concluded by identifying four foci for further research and development to overcome constraints on implementing the model by the methods prototyped. A fifth focus recommends investigation of an optimal model and methods to develop capabilities of staff in building surveying practices for appraising, implementing, managing and developing the use of CAD systems.

    Continual Lifelong Learning with Neural Networks: A Review

    Humans and animals have the ability to continually acquire, fine-tune, and transfer knowledge and skills throughout their lifespan. This ability, referred to as lifelong learning, is mediated by a rich set of neurocognitive mechanisms that together contribute to the development and specialization of our sensorimotor skills as well as to long-term memory consolidation and retrieval. Consequently, lifelong learning capabilities are crucial for autonomous agents interacting in the real world and processing continuous streams of information. However, lifelong learning remains a long-standing challenge for machine learning and neural network models since the continual acquisition of incrementally available information from non-stationary data distributions generally leads to catastrophic forgetting or interference. This limitation represents a major drawback for state-of-the-art deep neural network models that typically learn representations from stationary batches of training data, thus without accounting for situations in which information becomes incrementally available over time. In this review, we critically summarize the main challenges linked to lifelong learning for artificial learning systems and compare existing neural network approaches that alleviate, to different extents, catastrophic forgetting. We discuss well-established and emerging research motivated by lifelong learning factors in biological systems such as structural plasticity, memory replay, curriculum and transfer learning, intrinsic motivation, and multisensory integration

    Final report key contents: main results accomplished by the EU-Funded project IM-CLeVeR - Intrinsically Motivated Cumulative Learning Versatile Robots

    This document presents the main scientific and technological achievements of the project IM-CLeVeR. The document is organised as follows: 1. Project executive summary: a brief overview of the project vision, objectives and keywords. 2. Beneficiaries of the project and contacts: list of Teams (partners) of the project, Team Leaders and contacts. 3. Project context and objectives: the vision of the project and its overall objectives. 4. Overview of work performed and main results achieved: a one-page overview of the main results of the project. 5. Overview of main results per partner: a bullet-point list of main results per partner. 6. Main achievements in detail, per partner: a thorough explanation of the main results per partner (including collaborative work), with reference to the main publications supporting them.

    Autonomous selection of the "what" and the "how" of learning: an intrinsically motivated system tested with a two armed robot

    In our previous research we focused on the role of intrinsically motivated learning signals in driving the selection and learning of different skills. This work takes a further step towards more autonomous and versatile robots, implementing a 3-level hierarchical architecture with the mechanisms necessary both to select goals to pursue and to search for the best way to achieve them. In particular, we focus on the important problem of providing artificial agents with a decoupled architecture that separates the selection of goals from the selection of resources. To verify our solution, we use the architecture to control the two redundant arms of a simulated iCub robotic platform tested in a reaching task within a 3D environment. We compare its performance to that of a previous model with a coupled architecture, in which the different goals are associated at design time with the different modules pursuing them.