8 research outputs found

    A Survey of the Research and Development of Cerebellar Model Neural Networks (小脑模型神经网络研究和发展综述)

    Get PDF
    The cerebellar model neural network (CMAC) was proposed by Albus in a series of foundational application papers. This paper first analyzes the biological basis, basic principles, and learning algorithm of the CMAC neural network, together with its extensions. On this basis, it surveys the research progress of the CMAC model and some of its applications. The CMAC is a locally learning network with a simple structure, fast convergence, and easy software and hardware implementation, and therefore has broad application prospects. Finally, future development trends of the CMAC model are predicted. Funded by the State Key Laboratory of Industrial Automation (K01001
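
    As a rough orientation for the CMAC idea summarized above, the sketch below is a minimal tile-coding style implementation in Python (the class name, tiling counts, learning rate, and the sine-approximation usage are illustrative assumptions, not details from the surveyed paper): the output is the sum of the weights in the cells activated by an input, and learning spreads an LMS correction over those same cells, which is the local-learning behaviour behind the fast convergence noted in the abstract.

    import numpy as np

    class CMAC:
        """Minimal 1-D CMAC: overlapping tilings stored as dense weight tables."""
        def __init__(self, n_tilings=8, n_tiles=32, lr=0.1, x_min=0.0, x_max=1.0):
            self.n_tilings, self.n_tiles, self.lr = n_tilings, n_tiles, lr
            self.x_min, self.x_max = x_min, x_max
            self.w = np.zeros((n_tilings, n_tiles + 1))  # one weight table per tiling

        def _active_cells(self, x):
            # Each tiling is offset by a fraction of a tile width, so nearby
            # inputs share most active cells (the local generalization property).
            scaled = (x - self.x_min) / (self.x_max - self.x_min) * self.n_tiles
            return [(t, int(scaled + t / self.n_tilings)) for t in range(self.n_tilings)]

        def predict(self, x):
            return sum(self.w[t, i] for t, i in self._active_cells(x))

        def train(self, x, target):
            cells = self._active_cells(x)
            error = target - self.predict(x)
            for t, i in cells:              # LMS correction shared over active cells
                self.w[t, i] += self.lr * error / len(cells)

    # Usage (assumed target function): approximate sin(2*pi*x) on [0, 1].
    net = CMAC()
    for _ in range(2000):
        x = np.random.rand()
        net.train(x, np.sin(2 * np.pi * x))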

    Hierarchically Clustered Adaptive Quantization CMAC and Its Learning Convergence

    Get PDF
    No abstract available

    Hierarchical Reinforcement Learning

    Full text link

    Stability and weight smoothing in CMAC neural networks

    Get PDF
    Although the CMAC (Cerebellar Model Articulation Controller) neural network has been used successfully in control systems for many years, its property of local generalization, the availability of trained information for network responses at adjacent untrained locations, while responsible for the network's rapid learning and efficient implementation, produces network responses that are spiky in nature when trained with sparse or widely spaced training data, even when the underlying function being learned is quite smooth. Since the derivative of such a network response can vary widely, the CMAC's usefulness for solving optimization problems, as well as for certain other control system applications, can be severely limited. This dissertation presents the CMAC algorithm in sufficient detail to explore its strengths and weaknesses. Its properties of information generalization and storage are discussed, and comparisons are made with other neural network algorithms and with other adaptive control algorithms. A synopsis of the development of the fields of neural networks and adaptive control is included to lend historical perspective. A stability analysis of the CMAC algorithm for open-loop function learning is developed. This stability analysis casts the function-learning problem as a unique implementation of the model reference structure and develops a Lyapunov function to prove convergence of the CMAC to the target model. A new CMAC learning rule is developed by treating the CMAC as a set of simultaneous equations in a constrained optimization problem and making appropriate choices for the weight penalty matrix in the cost equation. This dissertation then presents a new CMAC learning algorithm which has the property of weight smoothing, improving generalization, function approximation in partially trained networks, and the partial derivatives of learned functions. This new learning algorithm is significant in that it derives from an optimum solution and demonstrates a dramatic performance improvement for function learning in the presence of widely spaced training data. Developed from a completely different analytical direction, this algorithm represents a coupling and extension of the single- and multi-resolution CMAC algorithms developed by other researchers. The insights derived from the analysis of the optimum solution and the resulting new learning rules are discussed, and suggestions for future work are presented.
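
    The weight-smoothing idea sketched in this abstract can be illustrated, in a hedged way, as a regularized least-squares problem over the CMAC weights; the first-difference penalty matrix used below is an assumption for illustration only, not the dissertation's actual weight penalty matrix or learning rule.

    import numpy as np

    def smoothed_cmac_weights(A, y, lam=1.0):
        """Solve min_w ||A w - y||^2 + lam * ||D w||^2 for CMAC weights w,
        where A is the (n_samples x n_weights) binary activation matrix and
        D is a first-difference operator that penalizes jumps between adjacent
        weights, so sparsely trained regions interpolate smoothly instead of
        producing spiky responses."""
        n = A.shape[1]
        D = (np.eye(n) - np.eye(n, k=1))[:-1]       # first-difference penalty (assumed)
        lhs = A.T @ A + lam * (D.T @ D)
        rhs = A.T @ y
        return np.linalg.lstsq(lhs, rhs, rcond=None)[0]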

    Reinforcement learning in continuous state and action spaces

    Get PDF
    Many traditional reinforcement-learning algorithms have been designed for problems with small finite state and action spaces. Learning in such discrete problems can be difficult, due to noise and delayed reinforcements. However, many real-world problems have continuous state or action spaces, which can make learning a good decision policy even more involved. In this chapter we discuss how to automatically find good decision policies in continuous domains. Because analytically computing a good policy from a continuous model can be infeasible, in this chapter we mainly focus on methods that explicitly update a representation of a value function, a policy or both. We discuss considerations in choosing an appropriate representation for these functions and discuss gradient-based and gradient-free ways to update the parameters. We show how to apply these methods to reinforcement-learning problems and discuss many specific algorithms. Amongst others, we cover gradient-based temporal-difference learning, evolutionary strategies, policy-gradient algorithms and actor-critic methods. We discuss the advantages of different approaches and compare the performance of a state-of-the-art actor-critic method and a state-of-the-art evolutionary strategy empirically.
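
    As one concrete instance of the gradient-based temporal-difference methods covered in the chapter, the sketch below shows semi-gradient TD(0) value estimation with a linear radial-basis-function representation over a continuous (here scalar) state; the feature construction, the env_step interface, and the step sizes are illustrative assumptions rather than the chapter's specific setup.

    import numpy as np

    def rbf_features(s, centers, width=0.1):
        # Smooth radial-basis features for a continuous scalar state s.
        return np.exp(-((s - centers) ** 2) / (2 * width ** 2))

    def td0(env_step, s0, centers, alpha=0.05, gamma=0.99, episodes=200):
        """Semi-gradient TD(0) with a linear value function v(s) = w . phi(s).
        env_step(s) -> (next_state, reward, done) is an assumed interface."""
        w = np.zeros(len(centers))
        for _ in range(episodes):
            s, done = s0, False
            while not done:
                s_next, r, done = env_step(s)
                phi = rbf_features(s, centers)
                v_next = 0.0 if done else w @ rbf_features(s_next, centers)
                delta = r + gamma * v_next - w @ phi    # TD error
                w += alpha * delta * phi                # semi-gradient update
                s = s_next
        return w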

    A Framework for Aggregation of Multiple Reinforcement Learning Algorithms

    Get PDF
    Aggregation of multiple Reinforcement Learning (RL) algorithms is a new and effective technique to improve the quality of Sequential Decision Making (SDM). The quality of an SDM depends on long-term rewards rather than instant rewards. RL methods are often adopted to deal with SDM problems. Although many RL algorithms have been developed, none is consistently better than the others. In addition, the parameters of RL algorithms significantly influence learning performance. There is no universal rule to guide the choice of algorithms and the setting of parameters. To handle this difficulty, a new multiple RL system, the Aggregated Multiple Reinforcement Learning System (AMRLS), is developed. In AMRLS, each RL algorithm (learner) learns individually in a learning module and provides its output to an intelligent aggregation module. The aggregation module dynamically aggregates these outputs and provides a final decision. Then, all learners take the action and update their policies individually. The two processes are performed alternately. AMRLS can deal with dynamic learning problems without the need to search for the optimal learning algorithm or the optimal values of learning parameters. It is claimed that several complementary learning algorithms can be integrated in AMRLS to improve the learning performance in terms of success rate, robustness, confidence, redundancy, and complementarity. There are two strategies for learning an optimal policy with RL methods. One is based on Value Function Learning (VFL), which learns an optimal policy expressed as a value function. The Temporal Difference RL (TDRL) methods are examples of this strategy. The other is based on Direct Policy Search (DPS), which directly searches for the optimal policy in the potential policy space. The Genetic Algorithms (GAs)-based RL (GARL) methods are instances of this strategy. A hybrid learning architecture of GARL and TDRL, HGATDRL, is proposed to combine them and improve the learning ability. AMRLS and HGATDRL are tested on several SDM problems, including the maze world problem, the pursuit domain problem, the cart-pole balancing system, the mountain car problem, and a flight control system. Experimental results show that the proposed framework and method can enhance the learning ability and improve the learning performance of a multiple RL system.
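
    The aggregate-then-share-the-action loop described above can be sketched as follows; the majority-vote aggregation and the learner interface (select_action / update) are assumptions for illustration, whereas AMRLS itself uses an intelligent aggregation module rather than a plain vote.

    from collections import Counter

    def aggregated_step(learners, state, env_step):
        # 1. Each RL learner proposes an action for the current state.
        proposals = [learner.select_action(state) for learner in learners]
        # 2. The aggregation module combines the proposals into one decision
        #    (simple majority vote here, as an assumed stand-in).
        action = Counter(proposals).most_common(1)[0][0]
        # 3. Every learner takes the same action and updates its own policy.
        next_state, reward, done = env_step(state, action)
        for learner in learners:
            learner.update(state, action, reward, next_state, done)
        return next_state, done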

    Scaling-up reinforcement learning using parallelization and symbolic planning

    Get PDF
    EThOS - Electronic Theses Online Service, United Kingdom