Self-organization of action hierarchy and compositionality by reinforcement learning with recurrent neural networks

Authors: Doya Kenji, Han Dongqi, Tani Jun
Publication date: 26/11/2019

Recurrent neural networks (RNNs) for reinforcement learning (RL) have shown distinct advantages, e.g., solving memory-dependent tasks and meta-learning. However, little effort has been spent on improving RNN architectures and on understanding the underlying neural mechanisms for performance gain. In this paper, we propose a novel, multiple-timescale, stochastic RNN for RL. Empirical results show that the network can autonomously learn to abstract sub-goals and can self-develop an action hierarchy using internal dynamics in a challenging continuous control task. Furthermore, we show that the self-developed compositionality of the network enables faster re-learning when adapting to a new task that is a re-composition of previously learned sub-goals than when starting from scratch. We also found that improved performance can be achieved when neural activities are subject to stochastic rather than deterministic dynamics.

arXiv.org e-Print Archive

OIST Institutional Repository

An authoring tool for structuring and annotating on-line educational courses : a thesis presented in partial fulfilment of the requirements for the degree of Master of Science in Computer Science at Massey University

Author: Wang Yang
Publication venue: Massey University
Publication date: 01/01/2002

This thesis studies the design and prototype implementation of a new web-based course authoring system for the Technology Integrated Learning Environment (TILE) project. The TILE authoring system edits the course structure and allows the author to annotate the course structure with meta-data. It makes extensive use of XML technology to communicate structured data across the Internet, as well as for both local and web-side databases. The authoring tool is designed to support development by multiple authors and has check-in and check-out, as well as version control facilities. It also provides an interface for adopting other multimedia tools such as AudioGraph. The tool has an easy-to-use graphical user interface. The technical problems that have been solved in this project include issues such as cross-platform support and drag-and-drop functionality using JDK 1.1.8. System environments, such as relational database setup, XML database setup, and Java Swing setup on the Mac, have also been discussed. The authoring system interface analysis, database analysis and function analysis have been completed for the complete system as specified. An intermediate system, designed to a reduced specification, has been implemented as a prototype, and details of this system, which can work independently of the TILE delivery system, are included. The full TILE authoring system, including InstantDB database access, has also been partially implemented. The prototype application has also been tested on the PC platform.

Massey Research Online

Pilot Evaluation of the Mexican Model of Dual TVET in the State of Mexico

Authors: Aragón Edgar, Fuentes Hugo, López-Fogués Aurora, Rosado Rene, Valiente Oscar
Publication venue: University of Glasgow and Tecnológico de Monterrey
Publication date: 01/02/2018

Since the first public announcement of the Mexican Model of Dual TVET (MMFD) in June 2013, more than 5,000 apprentices have enrolled in the programme and around 2,000 have already graduated. The Ministry of Education (SEP and CONALEP), the Chambers of Commerce (i.e. COPARMEX) and the German Cooperation Agencies (i.e. CAMEXA) have been collaborating with state authorities, families, schools and companies to turn this initial idea into a significant and sustainable initiative. Although the numbers are still small, it seemed necessary to undertake a pilot evaluation study of the implementation and impact of this programme on its participants to inform those responsible for this policy. We decided to focus our study on the State of Mexico because of the higher number of apprentices in this state and because of the access that the CONALEP authorities gave us to the informants. The report is structured in four main sections. In the first, we review the international evidence on the experiences of policy transfer of Dual TVET. Transferring international good practices in TVET is always a complex process that requires careful attention to the experiences and lessons of those who have tried to do it before. In the second section, we present the main characteristics of the Mexican Model of Dual TVET and the specificities of its implementation in the State of Mexico. In a federal country like Mexico, it is important to understand that national policies may vary widely across states in terms of design and implementation. The third section outlines the methodology of the study, which is inspired by realist evaluation principles. Realist evaluation not only tries to measure the impact of interventions on beneficiaries, but also to understand the causal mechanisms that explain why a policy is more effective in certain contexts and with certain beneficiary populations than in others.
In the final section, the results of the interviews and the survey with 25 apprentices who completed their studies under the MMFD in the State of Mexico are presented. The small sample limits the representativeness of our findings, but it offers some expected and unexpected results that should not be ignored by those involved in this policy in the State of Mexico and nationally.

Enlighten

Hierarchical Kickstarting for Skill Transfer in Reinforcement Learning

Authors: Grefenstette E, Matthews M, Parker-Holder J, Rocktäschel T, Samvelyan M
Publication venue: Proceedings of Machine Learning Research (PMLR)
Publication date: 01/01/2022

Practising and honing skills forms a fundamental component of how humans learn, yet artificial agents are rarely specifically trained to perform them. Instead, they are usually trained end-to-end, in the hope that useful skills will be implicitly learned in order to maximise the discounted return of some extrinsic reward function. In this paper, we investigate how skills can be incorporated into the training of reinforcement learning (RL) agents in complex environments with large state-action spaces and sparse rewards. To this end, we created SkillHack, a benchmark of tasks and associated skills based on the game of NetHack. We evaluate a number of baselines on this benchmark, as well as our own novel skill-based method Hierarchical Kickstarting (HKS), which is shown to outperform all other evaluated methods. Our experiments show that learning with prior knowledge of useful skills can significantly improve the performance of agents on complex problems. We ultimately argue that utilising predefined skills provides a useful inductive bias for RL problems, especially those with large state-action spaces and sparse rewards.

UCL Discovery