3,500 research outputs found

A Deep Hierarchical Approach to Lifelong Learning in Minecraft

Authors: Givony Shahar; Mankowitz Daniel J.; Mannor Shie; Tessler Chen; Zahavy Tom
Publication date: 30/11/2016

We propose a lifelong learning system that can reuse and transfer knowledge from one task to another while efficiently retaining its previously learned knowledge base. Knowledge is transferred by learning reusable skills to solve tasks in Minecraft, a popular video game that poses an unsolved, high-dimensional lifelong learning problem. These reusable skills, which we refer to as Deep Skill Networks, are then incorporated into our novel Hierarchical Deep Reinforcement Learning Network (H-DRLN) architecture using two techniques: (1) a deep skill array and (2) skill distillation, our novel variation of policy distillation (Rusu et al. 2015) for learning skills. Skill distillation enables the H-DRLN to efficiently retain knowledge, and therefore scale in lifelong learning, by accumulating knowledge and encapsulating multiple reusable skills into a single distilled network. The H-DRLN exhibits superior performance and lower learning sample complexity compared to the regular Deep Q Network (Mnih et al. 2015) in sub-domains of Minecraft.

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Effectiveness of Two Keyboarding Instructional Approaches on the Keyboarding Speed, Accuracy, and Technique of Elementary Students

Authors: Donica Denise K; Giroux Peter; Kim Young Joo
Publication venue: ScholarWorks at WMU
Publication date: 15/10/2019

Background: Keyboarding skill development is important for elementary students. Limited research exists to inform practice on effective keyboarding instruction methods. Method: Using a quasi-experimental design, we examined the effectiveness of Keyboarding Without Tears® (n = 786) in the experimental schools compared to the control schools, which used the district standard instructional approach of free web-based activities (n = 953), on improving keyboarding skills (speed, accuracy, and technique) in elementary students. Results: The results showed significant improvements in keyboarding speed and accuracy in all schools for all grades, favoring the experimental schools compared to the control schools. Significant differences in improvements in keyboarding technique were found, with large effect sizes favoring the experimental schools for kindergarten to the second grade and small effect sizes favoring the control schools for the third to fifth grade. Conclusion: Professionals involved in assisting with keyboarding skill development in children are recommended to begin training in these skills in early elementary grades, especially to assist in proper keyboarding technique development. While using free web-based activities is beneficial for improving keyboarding speed and accuracy, as well as keyboarding technique, using a developmentally based curriculum, such as Keyboarding Without Tears®, may further enhance improvements in the keyboarding skills of elementary students.

Crossref

ScholarWorks at WMU

Evolutionary Reinforcement Learning: A Survey

Authors: Bai Hui; Cheng Ran; Jin Yaochu
Publication date: 10/03/2023

Reinforcement learning (RL) is a machine learning approach that trains agents to maximize cumulative rewards through interactions with environments. The integration of RL with deep learning has recently resulted in impressive achievements in a wide range of challenging tasks, including board games, arcade games, and robot control. Despite these successes, there remain several crucial challenges, including brittle convergence properties caused by sensitive hyperparameters, difficulties in temporal credit assignment with long time horizons and sparse rewards, a lack of diverse exploration, especially in continuous search space scenarios, difficulties in credit assignment in multi-agent reinforcement learning, and conflicting objectives for rewards. Evolutionary computation (EC), which maintains a population of learning agents, has demonstrated promising performance in addressing these limitations. This article presents a comprehensive survey of state-of-the-art methods for integrating EC into RL, referred to as evolutionary reinforcement learning (EvoRL). We categorize EvoRL methods according to key research fields in RL, including hyperparameter optimization, policy search, exploration, reward shaping, meta-RL, and multi-objective RL. We then discuss future research directions in terms of efficient methods, benchmarks, and scalable platforms. This survey serves as a resource for researchers and practitioners interested in the field of EvoRL, highlighting the important challenges and opportunities for future research. With the help of this survey, researchers and practitioners can develop more efficient methods and tailored benchmarks for EvoRL, further advancing this promising cross-disciplinary research field.

arXiv.org e-Print Archive

Algorithms for Adaptive Game-playing Agents

Author: Justesen Niels Orsleff
Publication venue: IT-Universitetet i København
Publication date: 01/01/2019

The IT University of Copenhagen's Repository

Accessibility-Based Clustering for Efficient Learning of Locomotion Skills

Authors: Li Z; Yu W; Zhang C
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/03/2022

For model-free deep reinforcement learning of quadruped locomotion, the initialization of robot configurations is crucial for data efficiency and robustness. This work focuses on algorithmic improvements of data efficiency and robustness simultaneously through automatic discovery of initial states, which is achieved by our proposed K-Access algorithm based on accessibility metrics. Specifically, we formulated accessibility metrics to measure the difficulty of transitions between two arbitrary states, and proposed a novel K-Access algorithm for state-space clustering that automatically discovers the centroids of the static-pose clusters based on the accessibility metrics. By using the discovered centroidal static poses as the initial states, we can improve data efficiency by reducing redundant explorations, and enhance robustness through more effective explorations from the centroids to sampled poses. Focusing on fall recovery as a very hard set of locomotion skills, we validated our method extensively using Bittle, an 8-DoF quadrupedal robot. Compared to the baselines, the learning curve of our method converges much faster, requiring only 60% of the training episodes. With our method, the robot can successfully recover to standing poses within 3 seconds in 99.4% of the test cases. Moreover, the method can generalize successfully to other difficult skills, such as backflipping.

arXiv.org e-Print Archive

UCL Discovery