132 research outputs found

    Echo state model of non-Markovian reinforcement learning, An

    Department Head: Dale H. Grit. 2008 Spring. Includes bibliographical references (pages 137-142). There exists a growing need for intelligent, autonomous control strategies that operate in real-world domains. Theoretically, the state-action space must exhibit the Markov property in order for reinforcement learning to be applicable. Empirical evidence, however, suggests that reinforcement learning also applies to domains where the state-action space is approximately Markovian, a requirement for the overwhelming majority of real-world domains. These domains, termed non-Markovian reinforcement learning domains, raise a unique set of practical challenges. The reconstruction dimension required to approximate a Markovian state-space is unknown a priori and can potentially be large. Further, the spatial complexity of local function approximation of the reinforcement learning domain grows exponentially with the reconstruction dimension. Parameterized dynamic systems alleviate both embedding length and state-space dimensionality concerns by reconstructing an approximate Markovian state-space via a compact, recurrent representation. Yet this representation exacts a cost: modeling reinforcement learning domains via adaptive, parameterized dynamic systems is characterized by instability, slow convergence, and high computational or spatial training complexity. The objectives of this research are to demonstrate a stable, convergent, accurate, and scalable model of non-Markovian reinforcement learning domains. These objectives are fulfilled via fixed point analysis of the dynamics underlying the reinforcement learning domain and the Echo State Network, a class of parameterized dynamic system. Understanding models of non-Markovian reinforcement learning domains requires understanding the interactions between learning domains and their models. 
Fixed point analysis of the Mountain Car Problem reinforcement learning domain, for both local and nonlocal function approximations, suggests a close relationship between the locality of the approximation and the number and severity of bifurcations of the fixed point structure. This research suggests the likely cause of this relationship: reinforcement learning domains exist within a dynamic feature space in which trajectories are analogous to states. The fixed point structure maps dynamic space onto state-space. This explanation suggests two testable hypotheses. First, reinforcement learning is sensitive to state-space locality because states cluster as trajectories in time rather than space. Second, models using trajectory-based features should exhibit good modeling performance and few changes in fixed point structure. Analysis of the performance of lookup table, feedforward neural network, and Echo State Network (ESN) models on the Mountain Car Problem reinforcement learning domain confirms these hypotheses. The ESN is a large, sparse, randomly generated, unadapted recurrent neural network, which adapts a linear projection of the target domain onto the hidden layer. ESN modeling results on reinforcement learning domains show it achieves performance comparable to lookup table and neural network architectures on the Mountain Car Problem with minimal changes to fixed point structure. Also, the ESN achieves lookup-table-caliber performance when modeling Acrobot, a four-dimensional control problem, but is less successful modeling the lower-dimensional Modified Mountain Car Problem. These performance discrepancies are attributed to the ESN's excellent ability to represent complex short-term dynamics, and its inability to consolidate long temporal dependencies into a static memory. Without memory consolidation, reinforcement learning domains exhibiting attractors with multiple dynamic scales are unlikely to be well modeled via ESN. 
To mitigate this problem, a simple ESN memory consolidation method is presented and tested for stationary dynamic systems. These results indicate the potential to improve modeling performance in reinforcement learning domains via memory consolidation.

    Expanding the theoretical framework of reservoir computing


    Recurrent Neural Network Based Control for Risers and Oil Wells

    Undergraduate thesis (TCC) - Universidade Federal de Santa Catarina. Centro Tecnológico. Engenharia de Controle e Automação. Recurrent Neural Networks (RNNs) tend to be costly to optimize, though they possess desirable properties for dynamic system identification and serve as universal approximators for these systems. To reduce this cost, which can make RNNs impracticable, Echo State Networks were proposed in the literature. Echo State Networks (ESNs) are Recurrent Neural Networks divided into two parts: a recurrent network, named the reservoir, whose weights are fixed and randomly initialized; and a readout layer, composed of static neurons, where the output of the network is computed. Only the weights of the readout layer are trained, and relatively low-cost algorithms such as least squares can be used for this training. Due to these properties, ESNs can approximate complex dynamical systems with relatively low computational effort and a guarantee of reaching the global minimum in training, and they have obtained promising results in system identification and closed-loop control of dynamic systems. There are successful demonstrations of ESN applications in oil and gas plants. At the same time, in the oil industry, several approaches have been developed to solve the slugging flow problem using feedback control. Slugging flow is a pertinent problem on oil platforms because it can significantly hinder oil production, implying severe financial losses. 
With this application in mind, this work uses an adaptive control strategy in which an ESN approximates the inverse model of the controlled system in order to calculate the control action. This approach was applied to control the bottomhole pressure of an oil well and to perform anti-slug control of a pipeline-riser system subject to a severe slugging flow regime. For the experiments, computer simulations were run using models already established in the literature. The closed-loop control of the oil well was subjected to setpoint tracking and disturbance rejection tests. For the riser, the test determined the largest production choke opening at which the riser maintains pressure stability, which corresponds to the maximum production without slugging flow. Based on the results obtained, this work demonstrated the applicability of ESNs to the control of oil production plants and to the stabilization of severe slugging.
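The reservoir-plus-readout structure described in this abstract can be sketched in a few lines. The following is a minimal illustration, not the thesis's implementation: the function names, the reservoir size, the spectral radius of 0.9, the ridge term, and the sine-wave task are all assumptions chosen for the example. Only the readout weights `W_out` are fitted, using regularized least squares.

```python
import numpy as np

def make_reservoir(n_inputs, n_reservoir, spectral_radius=0.9, density=0.1, seed=0):
    """Create fixed random input and recurrent weights (never trained)."""
    rng = np.random.default_rng(seed)
    # Extra input column holds the bias term.
    W_in = rng.uniform(-0.5, 0.5, size=(n_reservoir, n_inputs + 1))
    W = rng.uniform(-0.5, 0.5, size=(n_reservoir, n_reservoir))
    W *= rng.random((n_reservoir, n_reservoir)) < density  # sparsify
    # Rescale so the largest eigenvalue magnitude equals spectral_radius.
    W *= spectral_radius / np.max(np.abs(np.linalg.eigvals(W)))
    return W_in, W

def run_reservoir(W_in, W, inputs):
    """Drive the reservoir with the input sequence and collect its states."""
    x = np.zeros(W.shape[0])
    states = np.zeros((len(inputs), W.shape[0]))
    for t, u in enumerate(inputs):
        x = np.tanh(W_in @ np.concatenate(([1.0], np.atleast_1d(u))) + W @ x)
        states[t] = x
    return states

def train_readout(states, targets, ridge=1e-6):
    """Fit the linear readout by ridge-regularized least squares."""
    X = np.hstack([np.ones((len(states), 1)), states])
    return np.linalg.solve(X.T @ X + ridge * np.eye(X.shape[1]), X.T @ targets)

def predict(states, W_out):
    """Apply the trained readout to reservoir states."""
    X = np.hstack([np.ones((len(states), 1)), states])
    return X @ W_out

# Hypothetical task: one-step-ahead prediction of a sine wave.
signal = np.sin(np.arange(0, 60, 0.1))
u, y = signal[:-1], signal[1:]
W_in, W = make_reservoir(n_inputs=1, n_reservoir=100)
states = run_reservoir(W_in, W, u)
washout = 50  # discard the initial transient before fitting
W_out = train_readout(states[washout:], y[washout:])
mse = np.mean((predict(states[washout:], W_out) - y[washout:]) ** 2)
print("training MSE:", mse)
```

Because the reservoir weights stay fixed, the fit reduces to a single linear solve, which is the source of the low training cost the abstract mentions.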

    Inferring unknown unknowns: Regularized bias-aware ensemble Kalman filter

    Because of physical assumptions and numerical approximations, low-order models are affected by uncertainties in the state and parameters, and by model biases. Model biases, also known as model errors or systematic errors, are difficult to infer because they are 'unknown unknowns', i.e., we do not necessarily know their functional form a priori. With biased models, data assimilation methods may be ill-posed because either (i) they are 'bias-unaware' because the estimators are assumed unbiased, (ii) they rely on an a priori parametric model for the bias, or (iii) they can infer model biases that are not unique for the same model and data. First, we design a data assimilation framework to perform combined state, parameter, and bias estimation. Second, we propose a mathematical solution with a sequential method, i.e., the regularized bias-aware ensemble Kalman filter (r-EnKF), which requires a model of the bias and its gradient (i.e., the Jacobian). Third, we propose an echo state network as the model bias estimator. We derive the Jacobian of the network, and design a robust training strategy with data augmentation to accurately infer the bias in different scenarios. Fourth, we apply the r-EnKF to nonlinearly coupled oscillators (with and without time delay) affected by different forms of bias. The r-EnKF infers parameters, states, and a unique bias in real time. The applications that we showcase are relevant to acoustics, thermoacoustics, and vibrations; however, the r-EnKF opens new opportunities for combined state, parameter, and bias estimation for real-time and on-the-fly prediction in nonlinear systems. Comment: 22 Figure
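For orientation, the sketch below shows the analysis step of a standard stochastic ensemble Kalman filter applied to an augmented state (one state variable plus one parameter). It is not the regularized bias-aware r-EnKF of the abstract, which additionally uses a bias model and its Jacobian. The function name `enkf_analysis`, the toy two-component state, and all numerical values are assumptions made for illustration.

```python
import numpy as np

def enkf_analysis(ensemble, obs, H, R, rng):
    """Standard stochastic EnKF analysis step (no bias correction).

    ensemble: (n_state, n_members) forecast ensemble
    obs:      (n_obs,) observation vector
    H:        (n_obs, n_state) linear observation operator
    R:        (n_obs, n_obs) observation noise covariance
    """
    n_members = ensemble.shape[1]
    anomalies = ensemble - ensemble.mean(axis=1, keepdims=True)
    C = anomalies @ anomalies.T / (n_members - 1)   # sample covariance
    S = H @ C @ H.T + R                             # innovation covariance
    K = np.linalg.solve(S, H @ C).T                 # Kalman gain C H^T S^-1
    # Each member assimilates its own perturbed copy of the observation.
    noise = rng.multivariate_normal(np.zeros(len(obs)), R, size=n_members).T
    innovations = (obs[:, None] + noise) - H @ ensemble
    return ensemble + K @ innovations

# Hypothetical augmented state [x, theta]: only x is observed, but theta is
# updated too through its sample correlation with x.
rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=200)
theta = 0.5 * x + rng.normal(0.0, 0.5, size=200)
prior = np.vstack([x, theta])
H = np.array([[1.0, 0.0]])
R = np.array([[0.01]])
obs = np.array([2.0])
posterior = enkf_analysis(prior, obs, H, R, rng)
print("posterior mean:", posterior.mean(axis=1))
```

Augmenting the state vector with parameters is the usual way an EnKF performs combined state and parameter estimation; the unobserved parameter moves only as far as the ensemble covariance links it to the observed state.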

    Reservoir Computing for Learning in Structured Domains

    The study of learning models for directly processing complex data structures has gained increasing interest within the Machine Learning (ML) community during the last decades. In this regard, the efficiency, effectiveness, and adaptivity of ML models on large classes of data structures represent challenging and open research issues. The paradigm under consideration is Reservoir Computing (RC), a novel and extremely efficient methodology for modeling Recurrent Neural Networks (RNNs) for adaptive sequence processing. RC comprises a number of different neural models, among which the Echo State Network (ESN) is probably the most popular, widely used, and studied. Another research area of interest is represented by Recursive Neural Networks (RecNNs), a class of neural network models recently proposed for dealing with hierarchical data structures directly. In this thesis the RC paradigm is investigated and suitably generalized in order to approach the problems arising from learning in structured domains. The research studies described in this thesis cover classes of data structures characterized by increasing complexity, from sequences to tree and graph structures. Accordingly, the research focus moves progressively from the analysis of standard ESNs for sequence processing to the development of new models for tree- and graph-structured domains. The analysis of ESNs for sequence processing addresses the problem of identifying and characterizing the relevant factors that influence the reservoir dynamics and the ESN performance. Promising applications of ESNs in the emerging field of Ambient Assisted Living are also presented and discussed. 
Moving towards highly structured data representations, the ESN model is extended to deal with complex structures directly, resulting in the proposed TreeESN, which is suitable for domains comprising hierarchical structures, and GraphESN, which generalizes the approach to a large class of cyclic/acyclic directed/undirected labeled graphs. TreeESNs and GraphESNs represent both novel RC models for structured data and extremely efficient approaches for modeling RecNNs, ultimately contributing to the definition of an RC framework for learning in structured domains. The problem of adaptively exploiting the state space in GraphESNs is also investigated, with specific regard to tasks in which input graphs must be mapped into flat vectorial outputs, resulting in the GraphESN-wnn and GraphESN-NG models. Finally, the generalization performance of the proposed models is evaluated on both artificial and complex real-world tasks from different application domains, including Chemistry, Toxicology, and Document Processing.