203 research outputs found
A new adaptive multiple modelling approach for non-linear and non-stationary systems
This paper proposes a novel adaptive multiple modelling algorithm for non-linear and non-stationary systems. This simple modelling paradigm comprises K candidate sub-models which are all linear. With data available in an online fashion, the performance of all candidate sub-models are monitored based on the most recent data window, and M best sub-models are selected from the K candidates. The weight coefficients of the selected sub-model are adapted via the recursive least square (RLS) algorithm, while the coefficients of the remaining sub-models are unchanged. These M model predictions are then optimally combined to produce the multi-model output. We propose to minimise the mean square error based on a recent data window, and apply the sum to one constraint to the combination parameters, leading to a closed-form solution, so that maximal computational efficiency can be achieved. In addition, at each time step, the model prediction is chosen from either the resultant multiple model or the best sub-model, whichever is the best. Simulation results are given in comparison with some typical alternatives, including the linear RLS algorithm and a number of online non-linear approaches, in terms of modelling performance and time consumption
Approximation errors of online sparsification criteria
Many machine learning frameworks, such as resource-allocating networks,
kernel-based methods, Gaussian processes, and radial-basis-function networks,
require a sparsification scheme in order to address the online learning
paradigm. For this purpose, several online sparsification criteria have been
proposed to restrict the model definition on a subset of samples. The most
known criterion is the (linear) approximation criterion, which discards any
sample that can be well represented by the already contributing samples, an
operation with excessive computational complexity. Several computationally
efficient sparsification criteria have been introduced in the literature, such
as the distance, the coherence and the Babel criteria. In this paper, we
provide a framework that connects these sparsification criteria to the issue of
approximating samples, by deriving theoretical bounds on the approximation
errors. Moreover, we investigate the error of approximating any feature, by
proposing upper-bounds on the approximation error for each of the
aforementioned sparsification criteria. Two classes of features are described
in detail, the empirical mean and the principal axes in the kernel principal
component analysis.Comment: 10 page
Analyzing sparse dictionaries for online learning with kernels
Many signal processing and machine learning methods share essentially the
same linear-in-the-parameter model, with as many parameters as available
samples as in kernel-based machines. Sparse approximation is essential in many
disciplines, with new challenges emerging in online learning with kernels. To
this end, several sparsity measures have been proposed in the literature to
quantify sparse dictionaries and constructing relevant ones, the most prolific
ones being the distance, the approximation, the coherence and the Babel
measures. In this paper, we analyze sparse dictionaries based on these
measures. By conducting an eigenvalue analysis, we show that these sparsity
measures share many properties, including the linear independence condition and
inducing a well-posed optimization problem. Furthermore, we prove that there
exists a quasi-isometry between the parameter (i.e., dual) space and the
dictionary's induced feature space.Comment: 10 page
Sparse least squares support vector regression for nonstationary systems
A new adaptive sparse least squares support vector regression algorithm, referred to as SLSSVR has been introduced for the adaptive modeling of nonstationary systems. Using a sliding window of recent data set of size N to track t he non-stationary characteristics of the incoming data, our adaptive model is initially formulated based on least squares support vector regression with forgetting factor (without bias term). In order to obtain a sparse model in which some parameters are exactly zeros, a l 1 penalty was applied in parameter estimation in the dual problem. Furthermore we exploit the fact that since the associated system/kernel matrix in positive definite, the dual solution of least squares support vector machine without bias term, can be solved iteratively with guaranteed convergence. Furthermore since the models between two consecutive time steps there are (N-1) shared kernels/parameters, the online solution can be obtained efficiently using coordinate descent algorithm in the form of Gauss-Seidel algorithm with minimal number of iterations. This allows a very sparse model per time step to be obtained very efficiently, avoiding expensive matrix inversion. The real stock market dataset and simulated examples have shown that the proposed approaches can lead to superior performances in comparison with the linear recursive least algorithm and a number of online non-linear approaches in terms of modelling performance and model size
Lazy learning in radial basis neural networks: A way of achieving more accurate models
Radial Basis Neural Networks have been successfully used in a large number of applications having in its rapid convergence time one of its most important advantages. However, the level of generalization is usually poor and very dependent on the quality of the training data because some of the training patterns can be redundant or irrelevant. In this paper, we present a learning method that automatically selects the training patterns more appropriate to the new sample to be approximated. This training method follows a lazy learning strategy, in the sense that it builds approximations centered around the novel sample. The proposed method has been applied to three different domains an artificial regression problem and two time series prediction problems. Results have been compared to standard training method using the complete training data set and the new method shows better generalization abilities.Publicad
The Cascade Orthogonal Neural Network
In the paper new non-conventional growing neural network is proposed. It coincides with the Cascade-
Correlation Learning Architecture structurally, but uses ortho-neurons as basic structure units, which can be
adjusted using linear tuning procedures. As compared with conventional approximating neural networks proposed
approach allows significantly to reduce time required for weight coefficients adjustment and the training dataset
size
Growing Neural Networks Using Nonconventional Activation Functions
In the paper, an ontogenic artificial neural network (ANNs) is proposed. The network uses orthogonal
activation functions that allow significant reducing of computational complexity. Another advantage is numerical
stability, because the system of activation functions is linearly independent by definition. A learning procedure for
proposed ANN with guaranteed convergence to the global minimum of error function in the parameter space is
developed. An algorithm for structure network structure adaptation is proposed. The algorithm allows adding or
deleting a node in real-time without retraining of the network. Simulation results confirm the efficiency of the
proposed approach
- …