thesis

Non-Parametric Bayesian Methods for Linear System Identification

Abstract

Recent contributions have tackled the linear system identification problem by means of non-parametric Bayesian methods, which build on widely adopted machine learning techniques such as Gaussian Process regression and kernel-based regularized regression. Following the Bayesian paradigm, these procedures treat the impulse response of the system to be estimated as the realization of a Gaussian process. Typically, a Gaussian prior accounting for stability and smoothness of the impulse response is postulated as a function of some parameters (called hyper-parameters in the Bayesian framework). These are generally estimated by maximizing the so-called marginal likelihood, i.e. the likelihood after the impulse response has been marginalized out. Once the hyper-parameters have been fixed in this way, the final estimator is computed as the conditional expected value of the impulse response with respect to the posterior distribution, which coincides with the minimum variance estimator. Assuming that the identification data are corrupted by Gaussian noise, this estimator coincides with the solution of a regularized estimation problem in which the regularization term is the squared l2 norm of the impulse response, weighted by the inverse of the prior covariance function (a.k.a. kernel in the machine learning literature).

Recent works have shown that such Bayesian approaches are able to perform estimation and model selection jointly, thus overcoming one of the main issues affecting parametric identification procedures, namely complexity selection. While keeping the classical system identification methods (e.g. Prediction Error Methods and subspace algorithms) as a benchmark for numerical comparison, this thesis extends and analyzes some key aspects of the above-mentioned Bayesian procedure. In particular, four main topics are considered.

1. PRIOR DESIGN. Adopting Maximum Entropy arguments, a new type of l2 regularization is derived: the aim is to penalize the rank of the block Hankel matrix built with the Markov coefficients, thus controlling the complexity of the identified model, measured by its McMillan degree. By accounting for the coupling between different input-output channels, this new prior is particularly well suited to the identification of MIMO systems. To reduce the computational cost of the estimation algorithm, a tailored version of the Scaled Gradient Projection algorithm is designed to optimize the marginal likelihood.

2. CHARACTERIZATION OF UNCERTAINTY. The confidence sets returned by the non-parametric Bayesian identification algorithm are analyzed and compared with those returned by parametric Prediction Error Methods. The comparison is carried out in the impulse response space, by deriving "particle" versions (i.e. Monte Carlo approximations) of the standard confidence sets.

3. ONLINE ESTIMATION. The application of the non-parametric Bayesian system identification techniques is extended to an online setting, in which new data become available over time. Specifically, two key modifications of the original "batch" procedure are proposed in order to meet the real-time requirements. In addition, the identification of time-varying systems is tackled by introducing a forgetting factor in the estimation criterion and by treating it as a hyper-parameter.

4. POST PROCESSING: MODEL REDUCTION. Non-parametric Bayesian identification procedures estimate the unknown system in terms of its impulse response coefficients, thus returning a model with high (possibly infinite) McMillan degree. A tailored procedure is proposed to reduce such a model to one of lower degree, which appears more suitable for filtering and control applications. Different criteria for selecting the order of the reduced model are evaluated and compared.
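The generic estimation pipeline summarized in the abstract (Gaussian prior on the impulse response, hyper-parameters chosen by marginal likelihood, posterior mean as the final estimate) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the thesis: the TC (first-order stable spline) kernel as the prior covariance, a known noise variance, a truncated impulse response of length `n`, and a coarse grid search in place of a proper optimizer such as the Scaled Gradient Projection algorithm mentioned above. The helper names (`tc_kernel`, `regressor_matrix`, and so on) are hypothetical.

```python
import numpy as np

def tc_kernel(n, scale, decay):
    """Assumed prior covariance: K[i, j] = scale * decay**max(i, j), with i, j = 1..n."""
    idx = np.arange(1, n + 1)
    return scale * decay ** np.maximum.outer(idx, idx)

def regressor_matrix(u, n):
    """Toeplitz matrix Phi with (Phi @ g)[t] = sum_k g[k] * u[t - k], zero initial conditions."""
    N = len(u)
    Phi = np.zeros((N, n))
    for k in range(n):
        Phi[k:, k] = u[:N - k]
    return Phi

def neg_log_marginal_likelihood(y, Phi, K, noise_var):
    """-2 log p(y | hyper-parameters), up to a constant, for y ~ N(0, Phi K Phi^T + noise_var I)."""
    S = Phi @ K @ Phi.T + noise_var * np.eye(len(y))
    _, logdet = np.linalg.slogdet(S)
    return y @ np.linalg.solve(S, y) + logdet

def posterior_mean(y, Phi, K, noise_var):
    """E[g | y] for g ~ N(0, K) and y = Phi g + e, with e ~ N(0, noise_var I)."""
    S = Phi @ K @ Phi.T + noise_var * np.eye(len(y))
    return K @ Phi.T @ np.linalg.solve(S, y)

# Synthetic data from a stable impulse response (illustrative values only).
rng = np.random.default_rng(0)
n, N, noise_var = 30, 200, 0.01
lags = np.arange(1, n + 1)
g_true = 0.7 ** lags * np.sin(0.5 * lags)
u = rng.standard_normal(N)
Phi = regressor_matrix(u, n)
y = Phi @ g_true + np.sqrt(noise_var) * rng.standard_normal(N)

# Hyper-parameter selection: minimize the negative log marginal likelihood on a grid.
grid = [(s, d) for s in (0.1, 1.0, 10.0) for d in (0.5, 0.7, 0.9)]
scale, decay = min(
    grid,
    key=lambda h: neg_log_marginal_likelihood(y, Phi, tc_kernel(n, *h), noise_var),
)
K = tc_kernel(n, scale, decay)

# Final estimate: posterior mean given the selected hyper-parameters.
g_hat = posterior_mean(y, Phi, K, noise_var)

# Same estimate from the regularized problem
#   min_g ||y - Phi g||^2 + noise_var * g^T K^{-1} g,
# written as (K Phi^T Phi + noise_var I) g = K Phi^T y to avoid inverting K.
g_reg = np.linalg.solve(K @ Phi.T @ Phi + noise_var * np.eye(n), K @ Phi.T @ y)

print("selected (scale, decay):", (scale, decay))
print("max |g_hat - g_reg|:", np.max(np.abs(g_hat - g_reg)))
```

The last two lines of the computation illustrate the equivalence stated in the abstract: under Gaussian noise, the posterior mean and the solution of the kernel-weighted regularized least-squares problem agree up to numerical precision.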
