8 research outputs found

    A Survey of the Research and Development of Cerebellar Model Neural Networks (小脑模型神经网络研究和发展综述)

    Get PDF
    The cerebellar model neural network (CMAC) was proposed by Albus in a series of foundational application papers. This paper first analyzes the biological basis, basic principles, and learning algorithm of the CMAC neural network, together with its extensions. On this basis, it surveys the research progress of the CMAC model and some of its applications. The CMAC is a locally learning network with a simple structure, fast convergence, and easy software and hardware implementation, and therefore has broad application prospects. Finally, future development trends of the CMAC model are predicted. Funded by the State Key Laboratory of Industrial Automation (K01001
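
    As a rough orientation for the CMAC idea summarized above, the sketch below is a minimal tile-coding style implementation in Python (the class name, tiling counts, learning rate, and the sine-approximation usage are illustrative assumptions, not details from the surveyed paper): the output is the sum of the weights in the cells activated by an input, and learning spreads an LMS correction over those same cells, which is the local-learning behaviour behind the fast convergence noted in the abstract.

    import numpy as np

    class CMAC:
        """Minimal 1-D CMAC: overlapping tilings stored as dense weight tables."""
        def __init__(self, n_tilings=8, n_tiles=32, lr=0.1, x_min=0.0, x_max=1.0):
            self.n_tilings, self.n_tiles, self.lr = n_tilings, n_tiles, lr
            self.x_min, self.x_max = x_min, x_max
            self.w = np.zeros((n_tilings, n_tiles + 1))  # one weight table per tiling

        def _active_cells(self, x):
            # Each tiling is offset by a fraction of a tile width, so nearby
            # inputs share most active cells (the local generalization property).
            scaled = (x - self.x_min) / (self.x_max - self.x_min) * self.n_tiles
            return [(t, int(scaled + t / self.n_tilings)) for t in range(self.n_tilings)]

        def predict(self, x):
            return sum(self.w[t, i] for t, i in self._active_cells(x))

        def train(self, x, target):
            cells = self._active_cells(x)
            error = target - self.predict(x)
            for t, i in cells:              # LMS correction shared over active cells
                self.w[t, i] += self.lr * error / len(cells)

    # Usage (assumed target function): approximate sin(2*pi*x) on [0, 1].
    net = CMAC()
    for _ in range(2000):
        x = np.random.rand()
        net.train(x, np.sin(2 * np.pi * x))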

    Hierarchically Clustered Adaptive Quantization CMAC and Its Learning Convergence

    Get PDF
    No abstract available

    Hierarchical Reinforcement Learning

    Full text link

    Stability and weight smoothing in CMAC neural networks

    Get PDF
    Although the CMAC (Cerebellar Model Articulation Controller) neural network has been used successfully in control systems for many years, its property of local generalization, the availability of trained information for network responses at adjacent untrained locations, while responsible for the network's rapid learning and efficient implementation, produces network responses that are spiky in nature when trained with sparse or widely spaced training data, even when the underlying function being learned is quite smooth. Since the derivative of such a network response can vary widely, the CMAC's usefulness for solving optimization problems, as well as for certain other control system applications, can be severely limited. This dissertation presents the CMAC algorithm in sufficient detail to explore its strengths and weaknesses. Its properties of information generalization and storage are discussed, and comparisons are made with other neural network algorithms and with other adaptive control algorithms. A synopsis of the development of the fields of neural networks and adaptive control is included to lend historical perspective. A stability analysis of the CMAC algorithm for open-loop function learning is developed. This stability analysis casts the function-learning problem as a unique implementation of the model reference structure and develops a Lyapunov function to prove convergence of the CMAC to the target model. A new CMAC learning rule is developed by treating the CMAC as a set of simultaneous equations in a constrained optimization problem and making appropriate choices for the weight penalty matrix in the cost equation. This dissertation then presents a new CMAC learning algorithm which has the property of weight smoothing, improving generalization, function approximation in partially trained networks, and the partial derivatives of learned functions. This new learning algorithm is significant in that it derives from an optimum solution and demonstrates a dramatic performance improvement for function learning in the presence of widely spaced training data. Developed from a completely different analytical direction, this algorithm represents a coupling and extension of the single- and multi-resolution CMAC algorithms developed by other researchers. The insights derived from the analysis of the optimum solution and the resulting new learning rules are discussed, and suggestions for future work are presented.
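
    The weight-smoothing idea sketched in this abstract can be illustrated, in a hedged way, as a regularized least-squares problem over the CMAC weights; the first-difference penalty matrix used below is an assumption for illustration only, not the dissertation's actual weight penalty matrix or learning rule.

    import numpy as np

    def smoothed_cmac_weights(A, y, lam=1.0):
        """Solve min_w ||A w - y||^2 + lam * ||D w||^2 for CMAC weights w,
        where A is the (n_samples x n_weights) binary activation matrix and
        D is a first-difference operator that penalizes jumps between adjacent
        weights, so sparsely trained regions interpolate smoothly instead of
        producing spiky responses."""
        n = A.shape[1]
        D = (np.eye(n) - np.eye(n, k=1))[:-1]       # first-difference penalty (assumed)
        lhs = A.T @ A + lam * (D.T @ D)
        rhs = A.T @ y
        return np.linalg.lstsq(lhs, rhs, rcond=None)[0]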

    Reinforcement learning in continuous state and action spaces

    Get PDF
    Many traditional reinforcement-learning algorithms have been designed for problems with small finite state and action spaces. Learning in such discrete problems can be difficult, due to noise and delayed reinforcements. However, many real-world problems have continuous state or action spaces, which can make learning a good decision policy even more involved. In this chapter we discuss how to automatically find good decision policies in continuous domains. Because analytically computing a good policy from a continuous model can be infeasible, in this chapter we mainly focus on methods that explicitly update a representation of a value function, a policy or both. We discuss considerations in choosing an appropriate representation for these functions and discuss gradient-based and gradient-free ways to update the parameters. We show how to apply these methods to reinforcement-learning problems and discuss many specific algorithms. Amongst others, we cover gradient-based temporal-difference learning, evolutionary strategies, policy-gradient algorithms and actor-critic methods. We discuss the advantages of different approaches and compare the performance of a state-of-the-art actor-critic method and a state-of-the-art evolutionary strategy empirically.
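
    As one concrete instance of the gradient-based temporal-difference methods covered in the chapter, the sketch below shows semi-gradient TD(0) value estimation with a linear radial-basis-function representation over a continuous (here scalar) state; the feature construction, the env_step interface, and the step sizes are illustrative assumptions rather than the chapter's specific setup.

    import numpy as np

    def rbf_features(s, centers, width=0.1):
        # Smooth radial-basis features for a continuous scalar state s.
        return np.exp(-((s - centers) ** 2) / (2 * width ** 2))

    def td0(env_step, s0, centers, alpha=0.05, gamma=0.99, episodes=200):
        """Semi-gradient TD(0) with a linear value function v(s) = w . phi(s).
        env_step(s) -> (next_state, reward, done) is an assumed interface."""
        w = np.zeros(len(centers))
        for _ in range(episodes):
            s, done = s0, False
            while not done:
                s_next, r, done = env_step(s)
                phi = rbf_features(s, centers)
                v_next = 0.0 if done else w @ rbf_features(s_next, centers)
                delta = r + gamma * v_next - w @ phi    # TD error
                w += alpha * delta * phi                # semi-gradient update
                s = s_next
        return w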

    A Framework for Aggregation of Multiple Reinforcement Learning Algorithms

    Get PDF
    Aggregation of multiple Reinforcement Learning (RL) algorithms is a new and effective technique to improve the quality of Sequential Decision Making (SDM). The quality of an SDM depends on long-term rewards rather than instant rewards. RL methods are often adopted to deal with SDM problems. Although many RL algorithms have been developed, none is consistently better than the others. In addition, the parameters of RL algorithms significantly influence learning performance. There is no universal rule to guide the choice of algorithms and the setting of parameters. To handle this difficulty, a new multiple RL system, the Aggregated Multiple Reinforcement Learning System (AMRLS), is developed. In AMRLS, each RL algorithm (learner) learns individually in a learning module and provides its output to an intelligent aggregation module. The aggregation module dynamically aggregates these outputs and provides a final decision. Then, all learners take the action and update their policies individually. The two processes are performed alternately. AMRLS can deal with dynamic learning problems without the need to search for the optimal learning algorithm or the optimal values of learning parameters. It is claimed that several complementary learning algorithms can be integrated in AMRLS to improve the learning performance in terms of success rate, robustness, confidence, redundancy, and complementarity. There are two strategies for learning an optimal policy with RL methods. One is based on Value Function Learning (VFL), which learns an optimal policy expressed as a value function. The Temporal Difference RL (TDRL) methods are examples of this strategy. The other is based on Direct Policy Search (DPS), which directly searches for the optimal policy in the potential policy space. The Genetic Algorithms (GAs)-based RL (GARL) methods are instances of this strategy. A hybrid learning architecture of GARL and TDRL, HGATDRL, is proposed to combine them and improve the learning ability. AMRLS and HGATDRL are tested on several SDM problems, including the maze world problem, the pursuit domain problem, the cart-pole balancing system, the mountain car problem, and a flight control system. Experimental results show that the proposed framework and method can enhance the learning ability and improve the learning performance of a multiple RL system.
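
    The aggregate-then-share-the-action loop described above can be sketched as follows; the majority-vote aggregation and the learner interface (select_action / update) are assumptions for illustration, whereas AMRLS itself uses an intelligent aggregation module rather than a plain vote.

    from collections import Counter

    def aggregated_step(learners, state, env_step):
        # 1. Each RL learner proposes an action for the current state.
        proposals = [learner.select_action(state) for learner in learners]
        # 2. The aggregation module combines the proposals into one decision
        #    (simple majority vote here, as an assumed stand-in).
        action = Counter(proposals).most_common(1)[0][0]
        # 3. Every learner takes the same action and updates its own policy.
        next_state, reward, done = env_step(state, action)
        for learner in learners:
            learner.update(state, action, reward, next_state, done)
        return next_state, done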

    Scaling-up reinforcement learning using parallelization and symbolic planning

    Get PDF
    EThOS - Electronic Theses Online Service, United Kingdom