145 research outputs found

    Adaptive dynamic programming with eligibility traces and complexity reduction of high-dimensional systems

    This dissertation investigates the application of a variety of computational intelligence techniques, particularly clustering and adaptive dynamic programming (ADP) designs, especially heuristic dynamic programming (HDP) and dual heuristic programming (DHP). Moreover, one-step temporal-difference learning (TD(0)) and n-step TD (TD(λ)), together with their gradients, are used as learning algorithms to train and adapt online the families of ADP. The dissertation is organized into seven papers. The first paper demonstrates the robustness of model order reduction (MOR) for simulating complex dynamical systems. Agglomerative hierarchical clustering based on performance evaluation is introduced for MOR. This method computes the reduced-order denominator of the transfer function by clustering system poles in a hierarchical dendrogram. Several numerical examples of reduction techniques are taken from the literature for comparison with our work. In the second paper, HDP is combined with the Dyna algorithm for path planning. The third paper uses DHP with an eligibility trace parameter (λ) to track a reference trajectory under uncertainties for a nonholonomic mobile robot, using a first-order Sugeno fuzzy neural network structure for the critic and actor networks. In the fourth and fifth papers, a stability analysis for a model-free action-dependent HDP(λ) is demonstrated with batch and online learning implementations, respectively. The sixth paper combines two different gradient prediction levels of critic networks and provides convergence proofs. The seventh paper develops two hybrid recurrent fuzzy neural network structures for both critic and actor networks. They use a novel n-step gradient temporal-difference (gradient of TD(λ)) of an advanced ADP algorithm called value-gradient learning (VGL(λ)), and convergence proofs are given. Furthermore, the seventh paper is the first to combine the single network adaptive critic with VGL(λ). --Abstract, page iv
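    As a rough illustration of the eligibility-trace idea mentioned in the abstract above, the sketch below implements plain tabular TD(λ) value estimation with accumulating traces. It is not the dissertation's method: the ADP designs described there train neural-network critics and actors, whereas the table `V`, the function name `td_lambda`, the transition tuple layout, and the default step sizes here are all assumptions chosen for brevity.

```python
import numpy as np

def td_lambda(episodes, n_states, alpha=0.1, gamma=1.0, lam=0.8):
    """Tabular TD(lambda) with accumulating eligibility traces (illustrative only).

    episodes: iterable of lists of (state, reward, next_state, done) tuples.
    Returns the estimated state-value table V.
    """
    V = np.zeros(n_states)
    for episode in episodes:
        e = np.zeros(n_states)             # eligibility traces, reset each episode
        for s, r, s_next, done in episode:
            # Bootstrapped one-step target; no bootstrap past a terminal step.
            target = r + (0.0 if done else gamma * V[s_next])
            delta = target - V[s]          # one-step TD error
            e *= gamma * lam               # decay every trace
            e[s] += 1.0                    # mark the state just visited
            V += alpha * delta * e         # spread the error over recent states
    return V

# Hypothetical two-state chain: 0 -> 1 -> terminal, reward 1 on the last step.
episode = [(0, 0.0, 1, False), (1, 1.0, 1, True)]
print(td_lambda([episode], n_states=2, alpha=0.5, lam=0.0))  # TD(0): only state 1 updated
print(td_lambda([episode], n_states=2, alpha=0.5, lam=1.0))  # traces also credit state 0
```

    With `lam=0.0` the trace vanishes after every step, so the update reduces to TD(0); larger values of λ pass each TD error further back along the visited trajectory.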

    Adapting Swarm Intelligence For The Self-Assembly And Optimization Of Networks

    While self-assembly is a fairly active area of research in swarm intelligence and robotics, relatively little attention has been paid to the issues surrounding the construction of network structures. Here, methods developed previously for modeling and controlling the collective movements of groups of agents are extended to serve as the basis for self-assembly or "growth" of networks, using neural networks as a concrete application to evaluate this novel approach. One of the central innovations incorporated into the model presented here is having network connections arise as persistent "trails" left behind moving agents, trails that are reminiscent of pheromone deposits made by agents in ant colony optimization models. The resulting network connections are thus essentially a record of agent movements. The model's effectiveness is demonstrated by using it to produce two large networks that support subsequent learning of topographic and feature maps. Improvements produced by the incorporation of collective movements are also examined through computational experiments. These results indicate that methods for directing collective movements can be extended to support and facilitate network self-assembly. Additionally, the traditional self-assembly problem is extended to include the generation of network structures based on optimality criteria, rather than on target structures that are specified a priori. It is demonstrated that endowing the network components involved in the self-assembly process with the ability to engage in collective movements can be an effective means of generating computationally optimal network structures. This is confirmed on a number of challenging test problems from the domains of trajectory generation, time-series forecasting, and control. 
Further, this extension of the model is used to illuminate an important relationship between particle swarm optimization, which usually occurs in high-dimensional abstract spaces, and self-assembly, which is normally grounded in real and simulated 2D and 3D physical spaces.

    Goal-Based Control and Planning in Biped Locomotion Using Computational Intelligence Methods

    This work explores the application of neural fields to dynamical control tasks in the domain of biped walking. As a first approach, a controller architecture that uses 1D neural fields is proposed. This controller architecture is evaluated on the stability problem for the cart-and-pole inverted pendulum, used as a simplified biped walking model. The neural field controller, parameterized both manually and using an evolutionary algorithm (EA), is compared to a controller architecture based on a recurrent neural network (RNN), also parameterized by an EA. The non-evolved neural field controller performs better than the RNN controller. The evolved neural field controller in turn performs better than the non-evolved one and is able to recover quickly from worst-case initial conditions. Then, an extended control and planning architecture using 2D neural fields is developed and applied to the simple biped walking (SBW) problem. A set of optimal values for the control parameter, previously found using an EA, is used to parameterize the neural field controller. The resulting optimal neural field controller is compared to the linear controller proposed by Wisse et al., and to an optimal table-lookup controller using the same optimal parameters. Although the controllers proposed here for the SBW problem implement an active control strategy, they approximate passive dynamic walking (PDW) more closely than previous works by reducing the cumulative control action.