9 research outputs found

    A new kernel-based approach for overparameterized Hammerstein system identification

    In this paper we propose a new identification scheme for Hammerstein systems, which are dynamic systems consisting of a static nonlinearity and a linear time-invariant dynamic system in cascade. We assume that the nonlinear function can be described as a linear combination of p basis functions. We reconstruct the p coefficients of the nonlinearity together with the first n samples of the impulse response of the linear system by estimating an np-dimensional overparameterized vector, which contains all the combinations of the unknown variables. To avoid high variance in these estimates, we adopt a regularized kernel-based approach and, in particular, we introduce a new kernel tailored for Hammerstein system identification. We show that the resulting scheme provides an estimate of the overparameterized vector that can be uniquely decomposed as the combination of an impulse response and p coefficients of the static nonlinearity. We also show, through several numerical experiments, that the proposed method compares very favorably with two standard methods for Hammerstein system identification. Comment: 17 pages, submitted to IEEE Conference on Decision and Control 201
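
The overparameterization idea in the abstract above can be illustrated with a minimal sketch. This is not the paper's method: the paper's tailored kernel is replaced here by a plain ridge penalty, and the function names, the two-term polynomial basis, the FIR length, and the unit-norm convention for the nonlinearity coefficients are all assumptions made for illustration. The sketch estimates the np-dimensional vector by regularized least squares, reshapes it into an n-by-p matrix, and splits that matrix into an impulse response and a coefficient vector with a rank-one SVD.

```python
import numpy as np

def simulate_hammerstein(u, c, g, basis):
    """Static nonlinearity (sum of c_j * phi_j(u)) followed by an FIR filter g."""
    w = sum(cj * phi(u) for cj, phi in zip(c, basis))
    return np.convolve(w, g)[: len(u)]

def overparameterized_fit(u, y, n, basis, reg=1e-8):
    """Ridge-regularized LS estimate of the n-by-p matrix Theta[k, j] = g_k * c_j.

    The ridge penalty `reg` is a stand-in (assumption) for a kernel-based prior.
    """
    N, p = len(u), len(basis)
    cols = []
    for phi in basis:
        v = phi(u)
        for k in range(n):
            col = np.zeros(N)
            col[k:] = v[: N - k]  # basis output delayed by k samples
            cols.append(col)
    Phi = np.column_stack(cols)  # N x (n*p), ordered basis-major
    theta = np.linalg.solve(Phi.T @ Phi + reg * np.eye(n * p), Phi.T @ y)
    return theta.reshape(p, n).T  # n x p

def rank_one_split(Theta):
    """Best rank-one factorization Theta ~ g c^T, with c scaled to unit norm."""
    U, s, Vt = np.linalg.svd(Theta)
    g, c = U[:, 0] * s[0], Vt[0]
    if c[0] < 0:  # resolve the sign ambiguity of the SVD
        g, c = -g, -c
    return g, c

# Noise-free toy example (all values are hypothetical)
rng = np.random.default_rng(0)
u = rng.standard_normal(400)
basis = [lambda x: x, lambda x: x ** 3]
c_true = np.array([1.0, 0.5])
g_true = 0.8 ** np.arange(5)
y = simulate_hammerstein(u, c_true, g_true, basis)

Theta_hat = overparameterized_fit(u, y, n=5, basis=basis)
g_hat, c_hat = rank_one_split(Theta_hat)
```

Only the product of the impulse response and the coefficients is identifiable, so `g_hat` and `c_hat` match the true values up to a scale factor; the sketch fixes that factor by normalizing `c_hat`.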

    Comparison of least squares and exponential sine sweep methods for Parallel Hammerstein Models estimation

    Linearity is a common assumption for many real-life systems, but in many cases the nonlinear behavior of systems cannot be ignored and must be modeled and estimated. Among the various existing classes of nonlinear models, Parallel Hammerstein Models (PHM) are interesting because they are both easy to interpret and easy to estimate. One way to estimate PHM relies on the fact that the estimation problem is linear in the parameters, so that classical least squares (LS) estimation algorithms can be used. In that area, this article introduces a regularized LS estimation algorithm inspired by some of the recently developed regularized impulse response estimation techniques. Another means of estimating PHM is to use parametric or non-parametric methods based on exponential sine sweeps (ESS). These methods (LS and ESS) are founded on radically different mathematical backgrounds but are expected to tackle the same issue. A methodology is proposed here to compare them with respect to (i) their accuracy, (ii) their computational cost, and (iii) their robustness to noise. Tests are performed on simulated systems for several values of the methods' respective parameters and of the signal-to-noise ratio. Results show that, for a given set of data points, the ESS method is less demanding in computational resources than the LS method but is also less accurate. Furthermore, the LS method needs parameters to be set in advance, whereas the ESS method is not subject to conditioning issues and can be fully non-parametric. In summary, for a given set of data points, the ESS method can provide a first, automatic, and quick overview of a nonlinear system that can guide more computationally demanding and precise methods, such as the regularized LS one proposed here

    Design of nonlinear controllers through the virtual reference method and regularization

    This work proposes a new extension of the nonlinear formulation of the data-driven control method known as Nonlinear Virtual Reference Feedback Tuning (VRFT). When the process to be controlled contains a significant amount of noise, the standard Nonlinear VRFT approach, which uses the Least Squares method, yields estimates with poor statistical properties. These properties may lead the control system to undesirable closed-loop performance and even instability. To improve these statistical properties and controller sparsity, and hence the system's closed-loop performance, this work proposes the use of ℓ1 regularization in the nonlinear formulation of the VRFT method. Regularization is a technique that has been extensively employed and researched in the Machine Learning and System Identification communities lately. Furthermore, this technique is appropriate for reducing the variance of the estimates. A detailed analysis of the effect of noise on the estimate is made for the Nonlinear VRFT method. Finally, three different regularization methods, the third one proposed in this work, are compared to the standard Nonlinear VRFT
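
The effect of ℓ1 regularization on a least-squares estimate, as described in the abstract above, can be sketched generically. This is not the VRFT formulation and not any of the three regularization methods the work compares: it is a plain ℓ1-penalized least-squares problem solved with the iterative soft-thresholding algorithm (ISTA), and the regressor matrix, the noise level, and the penalty weight `lam` are hypothetical values chosen for illustration.

```python
import numpy as np

def soft_threshold(x, t):
    """Proximal operator of t * ||x||_1: shrink each entry toward zero by t."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def lasso_ista(X, y, lam, iters=5000):
    """Minimize 0.5 * ||y - X theta||^2 + lam * ||theta||_1 with ISTA."""
    L = np.linalg.norm(X, 2) ** 2  # Lipschitz constant of the smooth part's gradient
    theta = np.zeros(X.shape[1])
    for _ in range(iters):
        grad = X.T @ (X @ theta - y)
        theta = soft_threshold(theta - grad / L, lam / L)
    return theta

def objective(X, y, theta, lam):
    """Value of the l1-penalized least-squares cost."""
    return 0.5 * np.sum((y - X @ theta) ** 2) + lam * np.sum(np.abs(theta))

# Toy data (hypothetical): a sparse parameter vector observed through noisy regressors
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10))
theta_true = np.zeros(10)
theta_true[[0, 3]] = [2.0, -1.5]
y = X @ theta_true + 0.1 * rng.standard_normal(200)

lam = 5.0
theta_ls = np.linalg.lstsq(X, y, rcond=None)[0]  # unregularized least squares
theta_l1 = lasso_ista(X, y, lam)                 # l1-regularized estimate
```

The penalty pulls small coefficients toward zero, so `theta_l1` has a smaller ℓ1 norm than `theta_ls`; how many entries become exactly zero depends on `lam` and on the noise.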

    Robust identification of non-autonomous dynamical systems using stochastic dynamics models

    This paper considers the problem of system identification (ID) of linear and nonlinear non-autonomous systems from noisy and sparse data. We propose and analyze an objective function derived from a Bayesian formulation for learning a hidden Markov model with stochastic dynamics. We then analyze this objective function in the context of several state-of-the-art approaches for both linear and nonlinear system ID. In the former, we analyze least squares approaches for Markov parameter estimation, and in the latter, we analyze the multiple shooting approach. We demonstrate the limitations of the optimization problems posed by these existing methods by showing that they can be seen as special cases of the proposed optimization objective under certain simplifying assumptions: conditional independence of data and zero model error. Furthermore, we observe that our proposed approach has improved smoothness and inherent regularization that make it well-suited for system ID and provide mathematical explanations for these characteristics' origins. Finally, numerical simulations demonstrate a mean squared error over 8.7 times lower compared to multiple shooting when data are noisy and/or sparse. Moreover, the proposed approach can identify accurate and generalizable models even when there are more parameters than data or when the underlying system exhibits chaotic behavior

    A Behavioral Approach to Robust Machine Learning

    Machine learning is revolutionizing almost all fields of science and technology and has been proposed as a pathway to solving many previously intractable problems such as autonomous driving and other complex robotics tasks. While the field has demonstrated impressive results on certain problems, many of these results have not translated to applications in physical systems, partly due to the cost of system failure and partly due to the difficulty of ensuring reliable and robust model behavior. Deep neural networks, for instance, have simultaneously demonstrated both incredible performance in game playing and image processing, and remarkable fragility. This combination of high average performance and catastrophically bad worst-case performance presents a serious danger, as deep neural networks are currently being used in safety-critical tasks such as assisted driving. In this thesis, we propose a new approach to training models that have built-in robustness guarantees. Our approach to ensuring stability and robustness of the trained models is distinct from prior methods; where prior methods learn a model and then attempt to verify robustness/stability, we directly optimize over sets of models where the necessary properties are known to hold. Specifically, we apply methods from robust and nonlinear control to the analysis and synthesis of recurrent neural networks, equilibrium neural networks, and recurrent equilibrium neural networks. The techniques developed allow us to enforce properties such as incremental stability, incremental passivity, and incremental l2 gain bounds / Lipschitz bounds. A central consideration in the development of our model sets is the difficulty of fitting models. All models can be placed in the image of a convex set, or even R^N, allowing useful properties to be easily imposed during the training procedure via simple interior point methods, penalty methods, or unconstrained optimization. 
In the final chapter, we study the problem of learning networks of interacting models with guarantees that the resulting networked system is stable and/or monotone, i.e., the order relations between states are preserved. While our approach to learning in this chapter is similar to the previous chapters, the model set that we propose has a separable structure that allows for the scalable and distributed identification of large-scale systems via the alternating direction method of multipliers (ADMM)

    Modeling Asymmetries in Financial Data with Multiplicative Error Models

    This thesis addresses modeling of financial time series, especially stock market returns and daily price ranges. Modeling data of this kind can be approached with so-called multiplicative error models (MEM). These models nest several well known time series models such as GARCH, ACD and CARR models. They are able to capture many well established features of financial time series including volatility clustering and leptokurtosis. In contrast to these phenomena, different kinds of asymmetries have received relatively little attention in the existing literature. In this thesis asymmetries arise from various sources. They are observed in both conditional and unconditional distributions, for variables with non-negative values and for variables that have values on the real line. In the multivariate context asymmetries can be observed in the marginal distributions as well as in the relationships of the variables modeled. New methods for all these cases are proposed. Chapter 2 considers GARCH models and modeling of returns of two stock market indices. The chapter introduces the so-called generalized hyperbolic (GH) GARCH model to account for asymmetries in both conditional and unconditional distribution. In particular, two special cases of the GARCH-GH model which describe the data most accurately are proposed. They are found to improve the fit of the model when compared to symmetric GARCH models. The advantages of accounting for asymmetries are also observed through Value-at-Risk applications. Both theoretical and empirical contributions are provided in Chapter 3 of the thesis. In this chapter the so-called mixture conditional autoregressive range (MCARR) model is introduced, examined and applied to daily price ranges of the Hang Seng Index. The conditions for the strict and weak stationarity of the model as well as an expression for the autocorrelation function are obtained by writing the MCARR model as a first order autoregressive process with random coefficients. 
The chapter also introduces the inverse gamma (IG) distribution to CARR models. The advantages of CARR-IG and MCARR-IG specifications over conventional CARR models are found in the empirical application both in- and out-of-sample. Chapter 4 discusses the simultaneous modeling of absolute returns and daily price ranges. In this part of the thesis a vector multiplicative error model (VMEM) with asymmetric Gumbel copula is found to provide substantial benefits over the existing VMEM models based on elliptical copulas. The proposed specification is able to capture the highly asymmetric dependence of the modeled variables, thereby improving the performance of the model considerably. The economic significance of the results obtained is established when the information content of the volatility forecasts derived is examined.