
    Bayesian Nonparametric Adaptive Control using Gaussian Processes

    This technical report is a preprint of an article submitted to a journal. Most current Model Reference Adaptive Control (MRAC) methods rely on parametric adaptive elements, in which the number of parameters of the adaptive element is fixed a priori, often through expert judgment. An example of such an adaptive element is the Radial Basis Function Network (RBFN), with RBF centers pre-allocated based on the expected operating domain. If the system operates outside of the expected operating domain, this adaptive element can become ineffective in capturing and canceling the uncertainty, thus rendering the adaptive controller only semi-global in nature. This paper investigates a Gaussian Process (GP) based Bayesian MRAC architecture (GP-MRAC), which leverages the power and flexibility of GP Bayesian nonparametric models of uncertainty. GP-MRAC does not require the centers to be preallocated, can inherently handle measurement noise, and enables MRAC to handle a broader set of uncertainties, including those that are defined as distributions over functions. We use stochastic stability arguments to show that GP-MRAC guarantees good closed-loop performance with no prior domain knowledge of the uncertainty. Online-implementable GP inference methods are compared in numerical simulations against RBFN-MRAC with preallocated centers and are shown to provide better tracking and improved long-term learning. This research was supported in part by ONR MURI Grant N000141110688 and NSF grant ECS #0846750.
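    As a concrete illustration, the sketch below shows the kind of GP adaptive element the abstract describes: a budgeted GP regressor whose posterior mean serves as the uncertainty estimate that the control law cancels. The class name, RBF kernel, data budget, and noise level are illustrative assumptions, not the paper's exact construction.

```python
# A minimal sketch of a GP adaptive element for MRAC, assuming an RBF
# kernel and a fixed data budget; all settings here are placeholders.
import numpy as np

class GPAdaptiveElement:
    def __init__(self, length_scale=1.0, noise_var=0.01, budget=50):
        self.ls, self.noise_var, self.budget = length_scale, noise_var, budget
        self.X, self.y = [], []   # stored (state, observed model error) pairs

    def _k(self, A, B):
        d = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
        return np.exp(-0.5 * d / self.ls**2)

    def update(self, x, delta_obs):
        # Record a new observation; drop the oldest point once the budget
        # is exceeded (a crude stand-in for principled sparsification).
        self.X.append(x); self.y.append(delta_obs)
        if len(self.X) > self.budget:
            self.X.pop(0); self.y.pop(0)

    def predict(self, x):
        # Posterior mean of the uncertainty at x; no pre-allocated centers
        # are needed, and measurement noise enters through noise_var.
        if not self.X:
            return 0.0
        X, y = np.array(self.X), np.array(self.y)
        K = self._k(X, X) + self.noise_var * np.eye(len(X))
        return float(self._k(x[None, :], X) @ np.linalg.solve(K, y))
```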

    Experimental Results of Concurrent Learning Adaptive Controllers

    Commonly used Proportional-Integral-Derivative based UAV flight controllers are often seen to provide adequate trajectory-tracking performance only after extensive tuning. The gains of these controllers are tuned to particular platforms, which makes transferring controllers from one UAV to another time-intensive. This paper suggests the use of adaptive controllers to speed up the process of extracting good control performance from new UAVs. In particular, it is shown that a concurrent learning adaptive controller improves the trajectory-tracking performance of a quadrotor with a baseline linear controller directly imported from another quadrotor whose inertial characteristics and throttle mapping are very different. Concurrent learning adaptive control uses specifically selected and online-recorded data concurrently with instantaneous data and is capable of guaranteeing tracking error and weight error convergence without requiring persistency of excitation. Flight-test results are presented on indoor quadrotor platforms operated in MIT's RAVEN environment. These results indicate the feasibility of rapidly developing high-performance UAV controllers by using adaptive control to augment a controller transferred from another UAV with similar control assignment structure. United States. Office of Naval Research. Multidisciplinary University Research Initiative (Grant N000141110688); National Science Foundation (U.S.). Graduate Research Fellowship Program (Grant 0645960); Boeing Scientific Research Laboratories.
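    The update law below is a hedged sketch of the concurrent-learning idea described above: an instantaneous tracking-error term is combined with a term summed over specifically recorded data points, and it is the recorded-data term that removes the persistency-of-excitation requirement. Gains, array shapes, and the history format are illustrative assumptions.

```python
import numpy as np

def concurrent_learning_update(W, phi, e_track, history, gamma=1.0, gamma_c=0.5):
    """One step of a concurrent-learning weight update (illustrative form).
    W: (n_features, n_outputs) adaptive weights; phi: current feature vector;
    e_track: tracking error; history: list of (phi_j, eps_j) pairs, where
    eps_j is the model-error estimate stored when the point was recorded."""
    dW = -gamma * np.outer(phi, e_track)            # instantaneous term
    for phi_j, eps_j in history:                    # recorded-data term
        eps_hat = W.T @ phi_j - eps_j               # current estimation error
        dW -= gamma_c * np.outer(phi_j, eps_hat)
    return W + dW
```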

    Rapid transfer of controllers between UAVs using learning-based adaptive control

    Commonly used Proportional-Integral-Derivative based UAV flight controllers are often seen to provide adequate trajectory-tracking performance, but only after extensive tuning. The gains of these controllers are tuned to particular platforms, which makes transferring controllers from one UAV to another time-intensive. This paper formulates the problem of control transfer from a source system to a transfer system and proposes a solution that leverages well-studied techniques in adaptive control. It is shown that concurrent learning adaptive controllers improve the trajectory-tracking performance of a quadrotor with the baseline linear controller directly imported from another quadrotor whose inertial characteristics and throttle mapping are very different. Extensive flight-testing, using indoor quadrotor platforms operated in MIT's RAVEN environment, is used to validate the method. United States. Office of Naval Research. Multidisciplinary University Research Initiative (Grant N000141110688).
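    Structurally, the transfer scheme amounts to running the imported baseline gains unchanged on the new platform while an adaptive element absorbs the source/transfer mismatch. The composition below is a hypothetical sketch of that idea (names and signatures are assumptions); the adaptive element could be the concurrent-learning update sketched above.

```python
import numpy as np

def transferred_controller(x, x_ref, K_source, adaptive_element):
    """Baseline gains K_source are imported from the source UAV as-is;
    the adaptive term compensates the inertial/throttle differences of
    the transfer UAV rather than re-tuning the baseline."""
    e = x - x_ref
    u_baseline = -K_source @ e            # imported linear controller
    u_adaptive = -adaptive_element(x)     # cancels the model mismatch
    return u_baseline + u_adaptive
```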

    Improving the Practicality of Model-Based Reinforcement Learning: An Investigation into Scaling up Model-Based Methods in Online Settings

    This thesis is a response to the current scarcity of practical model-based control algorithms in the reinforcement learning (RL) framework. As yet there is no consensus on how best to integrate imperfect transition models into RL whilst mitigating policy improvement instabilities in online settings. Current state-of-the-art policy learning algorithms that surpass human performance often rely on model-free approaches that enjoy unmitigated sampling of transition data. Model-based RL (MBRL) instead attempts to distil experience into transition models that allow agents to plan new policies without needing to return to the environment and sample more data. The initial focus of this investigation is on kernel conditional mean embeddings (CMEs) (Song et al., 2009) deployed in an approximate policy iteration (API) algorithm (Grünewälder et al., 2012a). This existing MBRL algorithm boasts theoretically stable policy updates in continuous state and discrete action spaces. The Bellman operator's value function and (transition) conditional expectation are modelled and embedded respectively as functions in a reproducing kernel Hilbert space (RKHS). The resulting finite-induced approximate pseudo-MDP (Yao et al., 2014a) can be solved exactly in a dynamic programming algorithm with policy improvement suboptimality guarantees. However, model construction and policy planning scale cubically and quadratically, respectively, with the training set size, rendering the CME impractical for sample-abundant tasks in online settings. Three variants of CME API are investigated to strike a balance between stable policy updates and reduced computational complexity. The first variant models the value function and state-action representation explicitly in a parametric CME (PCME) algorithm with favourable computational complexity. However, a soft conservative policy update technique is developed to mitigate policy learning oscillations in the planning process. The second variant returns to the non-parametric embedding and contributes (along with external work) to the compressed CME (CCME), a sparse and computationally more favourable CME. The final variant is a fully end-to-end differentiable embedding trained with stochastic gradient updates. The value function remains modelled in an RKHS such that backpropagation is driven by a non-parametric RKHS loss function. The actively compressed CME (ACCME) satisfies the pseudo-MDP contraction constraint using a sparse softmax activation function. The size of the pseudo-MDP (i.e. the size of the embedding's last layer) is controlled by sparsifying the last-layer weight matrix, extending the truncated gradient method (Langford et al., 2009) with group lasso updates in a novel 'use it or lose it' neuron pruning mechanism. Surprisingly, this technique does not require extensive fine-tuning between control tasks.
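    To make the pseudo-MDP construction concrete, here is a minimal sketch of CME-style planning for discrete actions: per-action kernel regression yields embedding weights over the sampled successor states, and value iteration is then run exactly on that finite set. The kernel, regularisation, and data layout are illustrative assumptions, not the thesis's exact algorithm.

```python
# A hedged sketch of value iteration on a finite pseudo-MDP built from
# kernel conditional mean embeddings; one transition dataset per action.
import numpy as np

def rbf(A, B, ls=1.0):
    d = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-0.5 * d / ls**2)

def cme_value_iteration(data, reward_fn, gamma=0.9, lam=1e-3, iters=200):
    """data[a] = (S, S_next): transitions observed under discrete action a.
    CME weights alpha(s) = (K + lam*I)^{-1} k(S, s) embed E[V(s') | s, a]
    as a weighted sum over sampled successors, giving a finite pseudo-MDP
    that dynamic programming can solve exactly."""
    actions = list(data)
    S_all = np.vstack([data[a][1] for a in actions])    # evaluation points
    offs, n = {}, 0
    for a in actions:
        offs[a] = n; n += len(data[a][1])
    W = {}
    for a in actions:                                   # per-action CME weights
        S, _ = data[a]
        K = rbf(S, S) + lam * np.eye(len(S))
        W[a] = np.linalg.solve(K, rbf(S, S_all)).T      # (n, len(S))
    R = {a: reward_fn(S_all, a) for a in actions}       # vectorised rewards
    V = np.zeros(n)
    for _ in range(iters):                              # exact DP on pseudo-MDP
        Q = np.stack([R[a] + gamma * W[a] @ V[offs[a]:offs[a] + len(data[a][1])]
                      for a in actions])
        V = Q.max(0)
    return V, S_all
```

    Note how the cubic cost appears in the per-action solve and the quadratic cost in applying the weight matrices, which is exactly the scaling bottleneck the thesis's sparse and parametric variants target.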

    Sparse online Gaussian process adaptation for incremental backstepping flight control

    The presence of uncertainties caused by unforeseen malfunctions in actuation or measurement systems, or by changes in aircraft behaviour, could lead to aircraft loss of control during flight. This paper considers sparse online Gaussian Process (GP) adaptive augmentation for Incremental Backstepping (IBKS) flight control. IBKS uses angular accelerations and control deflections to reduce the dependency on the aircraft model. However, it requires knowledge of the relationship between inner and outer loops and of the control effectiveness. The proposed indirect adaptation significantly reduces model dependency. Global uniform ultimate boundedness is proved for the resultant GP-adaptive IBKS. The research shows that if the input-affine property is violated, e.g., in severe conditions with a combination of multiple failures, the IBKS can lose stability. Meanwhile, the proposed sparse GP-based estimator provides fast online identification, and the resultant controller demonstrates improved stability and tracking performance.
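    Sparsification is what makes online GP identification feasible at control rates. The sketch below shows one common scheme, a linear-independence novelty test over a bounded basis-vector set, offered as an illustration of the idea rather than the paper's exact estimator; the threshold, kernel, and noise level are assumptions.

```python
# A minimal sparse online GP: points join the basis-vector set only if
# they are poorly represented by the current basis. A full algorithm would
# project discarded points onto the basis; here we simply skip them.
import numpy as np

class SparseOnlineGP:
    def __init__(self, ls=1.0, noise=1e-2, tol=1e-2, max_basis=30):
        self.ls, self.noise, self.tol, self.max_basis = ls, noise, tol, max_basis
        self.BV, self.y = [], []            # basis vectors and their targets

    def _k(self, a, b):
        return np.exp(-0.5 * np.sum((a - b)**2) / self.ls**2)

    def _K(self):
        return np.array([[self._k(a, b) for b in self.BV] for a in self.BV])

    def update(self, x, y):
        if not self.BV:
            self.BV.append(x); self.y.append(y); return
        k = np.array([self._k(x, b) for b in self.BV])
        K = self._K() + 1e-9 * np.eye(len(self.BV))
        # Novelty: residual of k(x,.) after projection onto the basis span.
        gamma = self._k(x, x) - k @ np.linalg.solve(K, k)
        if gamma > self.tol and len(self.BV) < self.max_basis:
            self.BV.append(x); self.y.append(y)

    def predict(self, x):
        if not self.BV:
            return 0.0
        K = self._K() + self.noise * np.eye(len(self.BV))
        k = np.array([self._k(x, b) for b in self.BV])
        return float(k @ np.linalg.solve(K, np.array(self.y)))
```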

    Bioinspired composite learning control under discontinuous friction for industrial robots

    Adaptive control can be applied to robotic systems with parameter uncertainties, but improving its performance is usually difficult, especially under discontinuous friction. Inspired by the human motor learning control mechanism, an adaptive learning control approach is proposed for a broad class of robotic systems with discontinuous friction, where a composite error learning technique that exploits data memory is employed to enhance parameter estimation. Compared with classical feedback error learning control, the proposed approach can achieve superior transient and steady-state tracking without high-gain feedback and persistent excitation, at the cost of extra computational burden and memory usage. The performance improvement of the proposed approach has been verified by experiments based on a DENSO industrial robot. Comment: Submitted to the 2022 IFAC International Workshop on Adaptive and Learning Control Systems.
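    The sketch below illustrates the composite error learning idea in its simplest form: alongside the usual tracking-error term, a memory of recorded regressor data supplies a prediction-error term, so the estimate keeps converging without persistent excitation or high-gain feedback. Gains, shapes, and the memory format are illustrative assumptions, not the paper's exact law.

```python
import numpy as np

class CompositeLearner:
    """A hedged sketch of a composite error learning parameter update."""
    def __init__(self, n_params, gamma=1.0, k_c=0.5):
        self.theta = np.zeros(n_params)
        self.gamma, self.k_c = gamma, k_c
        self.W = np.zeros((n_params, n_params))   # accumulated phi phi^T
        self.b = np.zeros(n_params)               # accumulated phi * y

    def record(self, phi, y):
        # Data memory: excitation captured here keeps driving estimation
        # later, which is what relaxes the persistent-excitation condition.
        self.W += np.outer(phi, phi)
        self.b += phi * y

    def step(self, phi, e_track, dt):
        pred_err = self.W @ self.theta - self.b   # composite prediction term
        dtheta = self.gamma * (phi * e_track - self.k_c * pred_err)
        self.theta += dt * dtheta
        return self.theta
```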

    An Application of Kolmogorov's Superposition Theorem to Function Reconstruction in Higher Dimensions

    In this thesis we present a Regularization Network approach to reconstruct a continuous function ƒ:[0,1]^n→R from its function values ƒ(x_j) on discrete data points x_j, j=1,…,P. The ansatz is based on a new constructive version of Kolmogorov's superposition theorem. Typically, the numerical solution of mathematical problems suffers from the so-called curse of dimensionality. This term describes the exponential dependency of the involved numerical costs on the dimensionality n. To circumvent the curse at least to some extent, higher regularity assumptions are typically made on the function ƒ, which however are unrealistic in most cases. Therefore, we employ a representation of the function as a superposition of one-dimensional functions which does not require higher smoothness assumptions on ƒ than continuity. To this end, a constructive version of Kolmogorov's superposition theorem based on work by D. Sprecher is adapted in such a manner that one single outer function Φ and a universal inner function ψ suffice to represent the function ƒ. Here, ψ is the extension of a function which was defined by M. Köppen on a dense subset of the real line. The proofs of existence, continuity, and monotonicity are presented in this thesis. To compute the outer function Φ, we adapt a constructive algorithm by Sprecher such that in each iteration step, depending on ƒ, an element of a sequence of univariate functions {Φ_r}_r is computed. It will be shown that this sequence converges to a continuous limit Φ:R→R. This constructively proves Kolmogorov's superposition theorem with a single outer and a single inner function. Since the numerical complexity of computing the outer function Φ by this algorithm grows exponentially with the dimensionality, we alternatively present a Regularization Network approach which is based on this representation. Here, the outer function is computed from discrete function samples (x_j,ƒ(x_j)), j=1,…,P. The model to reconstruct ƒ is introduced in two steps. First, the outer function Φ is represented in a finite basis with unknown coefficients, which are then determined by a variational formulation, i.e. by the minimization of a regularized empirical error functional. A detailed numerical analysis of this model shows that the dimensionality of ƒ is transformed by Kolmogorov's representation into oscillations of Φ. Thus, the use of locally supported basis functions leads to an exponential growth of the complexity, since the spatial mesh resolution has to resolve the strong oscillations. Furthermore, a numerical analysis of the Fourier transform of Φ shows that the locations of the relevant frequencies in Fourier space can be determined a priori and are independent of ƒ. It also reveals a product structure of the outer function and directly motivates the definition of the final model. Therefore, Φ is replaced in the second step by a product of functions, where each factor is expanded in a Fourier basis with appropriate frequency numbers. Again, the coefficients in the expansions are determined by the minimization of a regularized empirical error functional. For both models, the underlying approximation spaces are developed by means of reproducing kernel Hilbert spaces, and the corresponding norms are the respective regularization terms in the empirical error functionals. Thus, both approaches can be interpreted as Regularization Networks.
However, it is important to note that the error functional for the second model is not convex and that nonlinear minimizers have to be used for the computation of the model parameters. A detailed numerical analysis of the product model shows that it is capable of reconstructing functions which depend on up to ten variables.
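    For orientation, the sketch below evaluates a superposition of the single-outer-function form f(x) = Σ_q Φ(Σ_p λ_p ψ(x_p + q·a)), with Φ and ψ passed as callables. The exact shift structure and constants follow Sprecher's construction in the thesis; the form shown here is schematic, and the Φ/ψ in the usage example are placeholders, not Sprecher's or Köppen's actual functions.

```python
# Schematic evaluation of a Kolmogorov-type superposition with a single
# outer function Phi and a single inner function psi (both assumed given).
import numpy as np

def kolmogorov_superposition(x, Phi, psi, lambdas, a):
    n = len(x)
    total = 0.0
    for q in range(2 * n + 1):               # 2n+1 outer summands
        inner = sum(lam * psi(xp + q * a) for lam, xp in zip(lambdas, x))
        total += Phi(inner)
    return total

# Illustrative use with placeholder Phi/psi:
f_approx = kolmogorov_superposition(
    x=np.array([0.3, 0.7]), Phi=np.tanh, psi=lambda t: t % 1.0,
    lambdas=[1.0, 0.5], a=0.1)
```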

    Low-level interpretability and high-level interpretability: a unified view of data-driven interpretable fuzzy system modelling

    This paper aims at providing an in-depth overview of designing interpretable fuzzy inference models from data within a unified framework. The objective of complex system modelling is to develop reliable and understandable models for human beings to gain insight into complex real-world systems whose first-principle models are unknown. Because system behaviour can be described naturally as a series of linguistic rules, data-driven fuzzy modelling has become an attractive and widely used paradigm for this purpose. However, fuzzy models constructed from data by adaptive learning algorithms usually suffer from a loss of model interpretability. Model accuracy and interpretability are two conflicting objectives, so interpretation preservation during adaptation in data-driven fuzzy system modelling is a challenging task, which has received much attention in the fuzzy system modelling community. In order to clearly discriminate the different roles of fuzzy sets, input variables, and other components in achieving an interpretable fuzzy model, this paper first proposes a taxonomy of fuzzy model interpretability in terms of low-level interpretability and high-level interpretability. The low-level interpretability of fuzzy models refers to interpretability achieved by optimizing the membership functions in terms of semantic criteria at the fuzzy set level, while high-level interpretability refers to interpretability obtained by dealing with the coverage, completeness, and consistency of the rules in terms of criteria at the fuzzy rule level. Criteria for low-level and high-level interpretability are identified respectively. Different data-driven fuzzy modelling techniques in the literature focusing on interpretability issues are reviewed and discussed from the perspective of low-level and high-level interpretability. Furthermore, some open problems about interpretable fuzzy models are identified and some potential new research directions on fuzzy model interpretability are suggested. Crown Copyright © 2008.
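    As a small illustration of the low-level criteria the taxonomy names, the sketch below checks two standard semantic properties of a fuzzy partition, coverage and distinguishability, for triangular membership functions. The thresholds and the Jaccard-style similarity measure are illustrative choices, not criteria prescribed by the paper.

```python
# Check coverage (every point of the domain has membership >= eps in some
# set) and distinguishability (no two sets are too similar) for a fuzzy
# partition of triangular membership functions.
import numpy as np

def tri(x, a, b, c):
    # Triangular membership function with feet a, c and peak b.
    return np.maximum(np.minimum((x - a) / (b - a + 1e-12),
                                 (c - x) / (c - b + 1e-12)), 0.0)

def check_partition(params, lo, hi, eps=0.2, max_sim=0.7):
    x = np.linspace(lo, hi, 1000)
    mu = np.stack([tri(x, *p) for p in params])      # one row per fuzzy set
    coverage_ok = bool(np.all(mu.max(0) >= eps))
    # Jaccard-style similarity between each pair of fuzzy sets.
    sims = [np.sum(np.minimum(mu[i], mu[j])) / np.sum(np.maximum(mu[i], mu[j]))
            for i in range(len(mu)) for j in range(i + 1, len(mu))]
    distinguishable = all(s <= max_sim for s in sims)
    return coverage_ok, distinguishable

# Example: three sets partitioning [0, 1].
print(check_partition([(0.0, 0.0, 0.5), (0.0, 0.5, 1.0), (0.5, 1.0, 1.0)], 0, 1))
```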