352 research outputs found

    Asymptotically Efficient Quasi-Newton Type Identification with Quantized Observations Under Bounded Persistent Excitations

    Full text link
    This paper is concerned with the optimal identification of dynamical systems in which only quantized output observations are available, under the assumption of fixed thresholds and bounded persistent excitations. Based on a time-varying projection, a weighted Quasi-Newton type projection (WQNP) algorithm is proposed. Under mild conditions on the weight coefficients, the algorithm is proved to be mean-square and almost surely convergent, and the convergence rate can be the reciprocal of the number of observations, which is the same order as that of the optimal estimate under accurate measurements. Furthermore, inspired by the structure of the Cramer-Rao lower bound, an information-based identification (IBID) algorithm is constructed with an adaptive design of the weight coefficients of the WQNP algorithm; the weight coefficients depend on the parameter estimates, which is the essential difficulty in the analysis of the algorithm. Beyond these convergence properties, the paper demonstrates that the IBID algorithm attains the Cramer-Rao lower bound asymptotically, and hence is asymptotically efficient. Numerical examples illustrate the effectiveness of the information-based identification algorithm. Comment: 16 pages, 3 figures, submitted to Automatic
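    The quantized-observation setting can be illustrated with a simplified stochastic-approximation sketch. This is not the paper's WQNP or IBID algorithm: it assumes logistic-distributed noise so the binary observation likelihood has a closed form, and the true parameters, gain sequence, and projection radius are all illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
theta_true = np.array([1.0, -0.5])   # unknown parameters (illustrative)
C = 0.0                              # fixed quantizer threshold
n, radius = 100_000, 10.0            # sample count; projection-ball radius

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

theta = np.zeros(2)
for k in range(1, n + 1):
    phi = rng.normal(size=2)                       # stand-in for a bounded persistently exciting regressor
    e = rng.logistic()                             # logistic noise => sigmoid observation likelihood
    s = float(phi @ theta_true + e > C)            # binary quantized observation
    grad = (s - sigmoid(phi @ theta - C)) * phi    # score of the Bernoulli log-likelihood
    theta += (4.0 / k) * grad                      # decaying gain, stochastic-approximation style
    nrm = np.linalg.norm(theta)
    if nrm > radius:                               # projection onto a bounded parameter set
        theta *= radius / nrm
```

Despite seeing only one bit per sample, the estimate recovers the parameters, which is the phenomenon the paper quantifies with exact rates and efficiency bounds.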

    A new kernel-based approach to system identification with quantized output data

    Full text link
    In this paper we introduce a novel method for linear system identification with quantized output data. We model the impulse response as a zero-mean Gaussian process whose covariance (kernel) is given by the recently proposed stable spline kernel, which encodes information on regularity and exponential stability. This serves as a starting point to cast our system identification problem into a Bayesian framework. We employ Markov Chain Monte Carlo methods to provide an estimate of the system. In particular, we design two methods based on the so-called Gibbs sampler that also allow estimation of the kernel hyperparameters by marginal likelihood maximization via the expectation-maximization method. Numerical simulations show the effectiveness of the proposed scheme compared to state-of-the-art kernel-based methods when these are employed in system identification with quantized data. Comment: 10 pages, 4 figure
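    The GP prior at the core of this approach can be written down directly. The sketch below builds the first-order stable spline (TC) kernel, K[i, j] = beta**max(i, j), and computes the resulting regularized estimate for the simpler unquantized linear case; the paper's Gibbs sampler for quantized data is not reproduced. The system, beta, and noise level are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, beta, sigma = 40, 200, 0.8, 0.1    # impulse-response length, data size, kernel/noise params

# First-order stable spline (TC) kernel: K[i, j] = beta**max(i, j), 0 < beta < 1.
# It enforces both smoothness and exponential decay of the impulse response.
idx = np.arange(1, n + 1)
K = beta ** np.maximum.outer(idx, idx)

g_true = 0.9 ** idx * np.sin(0.5 * idx)  # a decaying "true" impulse response (illustrative)
u = rng.normal(size=m)
Phi = np.array([[u[t - k] if t - k >= 0 else 0.0 for k in range(n)] for t in range(m)])
y = Phi @ g_true + sigma * rng.normal(size=m)

# Posterior mean under the prior g ~ N(0, K) with Gaussian measurement noise:
#   g_hat = K Phi^T (Phi K Phi^T + sigma^2 I)^{-1} y
G = Phi @ K @ Phi.T + sigma**2 * np.eye(m)
g_hat = K @ Phi.T @ np.linalg.solve(G, y)
```

With quantized observations the Gaussian likelihood is lost, which is exactly why the paper resorts to Gibbs sampling over the latent unquantized outputs.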

    Laughing Hyena Distillery: Extracting Compact Recurrences From Convolutions

    Full text link
    Recent advances in attention-free sequence models rely on convolutions as alternatives to the attention operator at the core of Transformers. In particular, long convolution sequence models have achieved state-of-the-art performance in many domains, but incur a significant cost during auto-regressive inference workloads -- naively requiring a full pass (or caching of activations) over the input sequence for each generated token -- similarly to attention-based models. In this paper, we seek to enable O(1) compute and memory cost per token in any pre-trained long convolution architecture to reduce memory footprint and increase throughput during generation. Concretely, our methods consist of extracting low-dimensional linear state-space models from each convolution layer, building upon rational interpolation and model-order reduction techniques. We further introduce architectural improvements to convolution-based layers such as Hyena: by weight-tying the filters across channels into heads, we achieve higher pre-training quality and reduce the number of filters to be distilled. The resulting model achieves 10x higher throughput than Transformers and 1.5x higher than Hyena at 1.3B parameters, without any loss in quality after distillation.
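    The distillation target rests on a basic identity: a (diagonal) state-space model with impulse response h[m] = C A^m B reproduces the long-convolution output while updating only an O(d)-dimensional state per token. The sketch below checks this equivalence with illustrative random parameters; it is not the authors' extraction procedure, which fits the SSM to a trained filter via rational interpolation.

```python
import numpy as np

rng = np.random.default_rng(0)
T, d = 64, 8                       # sequence length, state dimension (illustrative)
a = rng.uniform(0.1, 0.9, d)       # stable diagonal state matrix A = diag(a)
B = rng.normal(size=d)
C = rng.normal(size=d)

# Impulse response of the SSM: h[m] = C A^m B.
h = np.array([C @ (a**m * B) for m in range(T)])

u = rng.normal(size=T)
y_conv = np.convolve(u, h)[:T]     # convolution view: full pass over the input per output

# Recurrent view: O(d) compute and memory per generated token.
x = np.zeros(d)
y_rec = np.empty(T)
for k, uk in enumerate(u):
    x = a * x + B * uk             # cheap state update replaces the long convolution
    y_rec[k] = C @ x
```

Once a low-dimensional (a, B, C) matching the trained filter is found, generation cost per token no longer grows with the sequence length.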

    Optimal asymptotic identification under bounded disturbances

    Get PDF
    Cover title. Includes bibliographical references (p. 35-36). Research supported by an NSERC fellowship from the government of Canada, by the NSF (ECS-8552419), and by the ARO (DAAL03-86-K-0171). David N.C. Tse, Munther A. Dahleh, John N. Tsitsiklis

    Tree-Structured Nonlinear Adaptive Signal Processing

    Get PDF
    In communication systems, nonlinear adaptive filtering has become increasingly popular in a variety of applications such as channel equalization, echo cancellation and speech coding. However, existing nonlinear adaptive filters such as polynomial (truncated Volterra series) filters and multilayer perceptrons suffer from a number of problems. First, although high-order polynomials can approximate complex nonlinearities, they also train very slowly. Second, there is no systematic and efficient way to select their structure. As for multilayer perceptrons, they have a very complicated structure and train extremely slowly. Motivated by the success of classification and regression trees on difficult nonlinear and nonparametric problems, we propose the idea of a tree-structured piecewise linear adaptive filter. In the proposed method each node in a tree is associated with a linear filter restricted to a polygonal domain, and this is done in such a way that each pruned subtree is associated with a piecewise linear filter. A training sequence is used to adaptively update the filter coefficients and domains at each node, and to select the best pruned subtree and the corresponding piecewise linear filter. The tree-structured approach offers several advantages. First, it makes use of standard linear adaptive filtering techniques at each node to find the corresponding conditional linear filter. Second, it allows for efficient selection of the subtree and the corresponding piecewise linear filter of appropriate complexity. Overall, the approach is computationally efficient and conceptually simple. The tree-structured piecewise linear adaptive filter bears some similarity to classification and regression trees, but it is actually quite different. Here the terminal nodes are not just assigned a region and a class label or a regression value; rather, each represents a linear filter with a restricted domain. It also differs in that classification and regression trees are determined in a batch mode offline, whereas the tree-structured adaptive filter is determined recursively in real time.
    We first develop the specific structure of a tree-structured piecewise linear adaptive filter and derive a stochastic gradient-based training algorithm. We then carry out a rigorous convergence analysis of the proposed training algorithm, showing mean-square convergence of the adaptively trained tree-structured piecewise linear filter to the optimal tree-structured piecewise linear filter. Some new techniques are developed for analyzing stochastic gradient algorithms with fixed gains and (nonstandard) dependent data. Finally, numerical experiments are performed to show the computational and performance advantages of the tree-structured piecewise linear filter over linear and polynomial filters for equalization of high-frequency channels with severe intersymbol interference, echo cancellation in telephone networks, and predictive coding of speech signals
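    The core idea can be sketched in miniature: a depth-1 tree whose two leaves each hold a linear (affine) filter trained by LMS on the samples routed to them. This is a simplified illustration only; the dissertation's algorithm additionally adapts the domains and selects a pruned subtree, none of which appears here, and the target nonlinearity and step size are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
mu, n = 0.05, 5000                   # LMS step size and training length (illustrative)

# Two leaf nodes: the input domain is split at x = 0, and each leaf holds its
# own affine filter, adapted by standard LMS on the samples routed to it.
w = {0: np.zeros(2), 1: np.zeros(2)}

def target(x):                       # piecewise-linear "channel" to learn (illustrative)
    return x if x >= 0 else -0.5 * x

for _ in range(n):
    x = rng.uniform(-1, 1)
    phi = np.array([x, 1.0])         # affine regressor [x, bias]
    leaf = int(x >= 0)               # route the sample down the tree
    err = target(x) - w[leaf] @ phi
    w[leaf] += mu * err * phi        # LMS update at the selected node only
```

Each leaf converges to the linear law valid on its own region, which is how the pruned tree as a whole realizes a piecewise linear filter.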

    Model Order Reduction Based on Semidefinite Programming

    Get PDF
    The main topic of this PhD thesis is complexity reduction of linear time-invariant models. The complexity of such systems is measured by the number of differential equations forming the dynamical system; this number is called the order of the system. Order reduction is typically used as a tool to model complex systems whose simulation takes considerable time and/or has overwhelming memory requirements. Any model is an approximation of a real-world system, so it is reasonable to sacrifice some model accuracy in order to obtain a simpler representation. Once a low-order model is obtained, simulation becomes computationally cheaper, which saves time and resources. A low-order model still has to be "similar" to the full-order one in some sense. There are many ways of measuring "similarity", and such a measure is typically chosen depending on the application. Three settings of model order reduction were investigated in the thesis.
    The first is H infinity model order reduction, i.e., the distance between two models is measured by the H infinity norm. Although the problem has been tackled by many researchers, the optimal solutions are yet to be found; there are, however, a large number of methods which solve suboptimal problems and deliver accurate approximations. Recently, the research community has devoted more attention to large-scale systems and computationally scalable extensions of existing model reduction techniques. The algorithm developed in the thesis is based on matching samples of the frequency response. For a large class of systems the frequency response samples can be computed very efficiently, so the algorithm is relatively cheap computationally. It can be seen as a computationally scalable extension of the well-known Hankel model reduction, which is known to deliver very accurate solutions. One reason for this assessment is that the relaxation employed in the proposed algorithm is tightly related to the one used in Hankel model reduction. Numerical simulations also show that the accuracy of the method is comparable to that of Hankel model reduction.
    The second part of the thesis is devoted to parameterized model order reduction. A parameterized model is essentially a family of models which depend on certain design parameters. The goal in this setting is to approximate the whole family of models for all values of the parameters. The main motivation is the design of a model with an appropriate set of parameters: to make a good choice, the models need to be simulated for a large set of parameter values, after which a model with suitable frequency or step responses can be picked. Parameterized model reduction significantly simplifies this procedure. The proposed algorithm for parameterized model reduction is a straightforward extension of the one described above, and is applicable to modeling linear parameter-varying systems as well.
    Finally, the third topic is modeling interconnections of systems. In this thesis an interconnection is a collection of systems (or subsystems) connected in a typical block diagram. To avoid confusion, throughout the thesis the entire model is called a supersystem, as opposed to the subsystems it consists of. One specific case of structured model reduction is controller reduction, in which there are two subsystems: the plant and the controller. Two directions of model reduction of interconnected systems are considered: model reduction in the nu-gap metric and structured model reduction. To some extent, using the nu-gap metric makes it possible to model subsystems without considering the supersystem at all. This property can be exploited for extremely large supersystems for which some forms of analysis (evaluating stability, computing step responses, etc.) are intractable. A more systematic way of modeling, however, is structured model reduction. There, the objective is to approximate certain subsystems in such a way that crucial characteristics of the given supersystem, such as stability, structure of interconnections and frequency response, are preserved. In structured model reduction all subsystems are taken into account, not only the approximated ones. To address structured model reduction, the supersystem is represented in a coprime factor form, where its structure also appears in the coprime factors. Using this representation, the problem is reduced to H infinity model reduction, which is addressed by the presented framework.
    All the presented methods are validated on academic or known benchmark problems. Since all the methods are based on semidefinite programming, adding new constraints is a matter of formulating each constraint as a semidefinite one; a number of extensions are presented which illustrate the power of the approach. Properties of the methods are discussed throughout the thesis, and some remaining open problems conclude the manuscript
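    The idea of matching frequency response samples can be illustrated without the thesis's semidefinite programming machinery. The sketch below uses a classical linearized least-squares fit (Levy's method) to recover a low-order discrete-time model from samples of its frequency response; the true system, frequency grid, and model orders are illustrative, and this is a stand-in for, not a reproduction of, the thesis's algorithm.

```python
import numpy as np

# True second-order system (illustrative); we recover it from frequency samples.
b_true = np.array([1.0, 0.4, 0.0])       # numerator:   b0 + b1 z^-1 + b2 z^-2
a_true = np.array([1.0, -0.5, 0.06])     # denominator: 1 + a1 z^-1 + a2 z^-2

w = np.linspace(0.05, 3.0, 50)           # frequency grid
z = np.exp(-1j * w)                      # z^-1 evaluated on the unit circle
Z = np.stack([np.ones_like(z), z, z**2], axis=1)
H = (Z @ b_true) / (Z @ a_true)          # frequency response samples to match

# Levy linearization: B(w) - H(w) * (A(w) - 1) = H(w), which is linear in the
# unknowns [b0, b1, b2, a1, a2]; stack real and imaginary parts and solve.
M = np.hstack([Z, -H[:, None] * Z[:, 1:]])
rows = np.vstack([M.real, M.imag])
rhs = np.concatenate([H.real, H.imag])
x, *_ = np.linalg.lstsq(rows, rhs, rcond=None)
b_fit, a_fit = x[:3], np.concatenate([[1.0], x[3:]])
```

Because only frequency samples of the full model are needed, this kind of matching scales to systems far too large for methods that manipulate the full state-space realization.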

    Advancing Process Control using Orthonormal Basis Functions

    Get PDF

    Algorithms for Blind Equalization Based on Relative Gradient and Toeplitz Constraints

    Get PDF
    Blind Equalization (BE) refers to the problem of recovering the source symbol sequence from a signal received through a channel in the presence of additive noise and channel distortion, when the channel response is unknown and a training sequence is not accessible. To achieve BE, statistical or constellation properties of the source symbols are exploited. In BE algorithms, two main concerns are convergence speed and computational complexity. In this dissertation, we explore the application of relative gradient for equalizer adaptation with a structure constraint on the equalizer matrix, for fast convergence without excessive computational complexity. We model blind equalization with symbol-rate sampling as a blind source separation (BSS) problem and study two single-carrier transmission schemes, specifically block transmission with guard intervals and continuous transmission. Under either scheme, blind equalization can be achieved using independent component analysis (ICA) algorithms with a Toeplitz or circulant constraint on the structure of the separating matrix. We also develop relative gradient versions of the widely used Bussgang-type algorithms. Processing the equalizer outputs in sliding blocks, we are able to use the relative gradient for adaptation of the Toeplitz constrained equalizer matrix. The use of relative gradient makes the Bussgang condition appear explicitly in the matrix adaptation and speeds up convergence. For the ICA-based and Bussgang-type algorithms with relative gradient and matrix structure constraints, we simplify the matrix adaptations to obtain equivalent equalizer vector adaptations for reduced computational cost. Efficient implementations with fast Fourier transform, and approximation schemes for the cross-correlation terms used in the adaptation, are shown to further reduce computational cost. 
We also consider the use of a relative gradient algorithm for channel shortening in orthogonal frequency division multiplexing (OFDM) systems. The redundancy of the cyclic prefix symbols is used to shorten a channel with a long impulse response. We show promising preliminary results for a shortening algorithm based on the relative gradient.
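    The Bussgang family that the dissertation builds on can be illustrated with the classic constant modulus algorithm (CMA) on a real-valued (BPSK) signal. This is the standard stochastic gradient, not the relative-gradient, Toeplitz-constrained variants developed in the dissertation, and the channel, equalizer length, and step size are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n, L, mu = 20_000, 11, 1e-3                  # samples, equalizer taps, step size

s = rng.choice([-1.0, 1.0], size=n)          # unknown BPSK source (unit modulus)
r = np.convolve(s, [1.0, 0.4])[:n]           # mild ISI channel (illustrative)

w = np.zeros(L)
w[L // 2] = 1.0                              # center-spike initialization
cost = []
for k in range(L, n):
    x = r[k - L:k][::-1]                     # regressor of received samples
    y = w @ x                                # equalizer output
    cost.append((y**2 - 1.0) ** 2)           # dispersion before the update
    e = y * (y**2 - 1.0)                     # CMA error term: gradient of (y^2 - 1)^2 / 4
    w -= mu * e * x                          # Bussgang-type stochastic gradient step

initial = np.mean(cost[:2000])
final = np.mean(cost[-2000:])
```

The dispersion cost drives the output toward the source's constant modulus without any training sequence, which is the defining trait of this blind family.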