The wavelet-NARMAX representation: a hybrid model structure combining polynomial models with multiresolution wavelet decompositions
A new hybrid model structure combining polynomial models with multiresolution wavelet decompositions is introduced for nonlinear system identification. Polynomial models play an important role in approximation theory and have been extensively used in linear and nonlinear system identification. Wavelet decompositions, in which the basis functions are localized in both time and frequency, outperform many other approximation schemes and offer a flexible solution for approximating arbitrary functions. Although wavelet representations can approximate even severe nonlinearities in a given signal very well, their advantage can be lost when wavelets are used to capture linear or low-order nonlinear behaviour in a signal. To exploit the global property of polynomials and the local property of wavelet representations simultaneously, in this study polynomial models and wavelet decompositions are combined in a parallel structure to represent nonlinear input-output systems. As a special form of the NARMAX model, this hybrid model structure is referred to as the wavelet-NARMAX, or simply WANARMAX, model. Such a WANARMAX representation of an input-output system generally involves a large number of basis functions and therefore a great number of model terms, but experience reveals that only a small number of these terms contribute significantly to the system output. A new fast orthogonal least squares algorithm, the matching pursuit orthogonal least squares (MPOLS) algorithm, is therefore also introduced in this study to determine which terms should be included in the final model.
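The term-selection idea behind MPOLS can be illustrated with a plain greedy scheme: repeatedly pick the dictionary column most correlated with the current residual and refit. A minimal numpy sketch of this general idea only, not the published MPOLS algorithm, which adds an explicit orthogonalization for speed:

```python
import numpy as np

def greedy_term_selection(D, y, n_terms):
    """Pick dictionary columns one at a time by maximal correlation with
    the current residual, refitting by least squares after each pick."""
    residual = y.copy()
    selected = []
    coef = np.zeros(0)
    for _ in range(n_terms):
        # Normalized correlation of every column with the residual.
        corr = np.abs(D.T @ residual) / np.linalg.norm(D, axis=0)
        corr[selected] = -np.inf              # never reselect a term
        selected.append(int(np.argmax(corr)))
        # Refit on the selected subset and update the residual.
        coef, *_ = np.linalg.lstsq(D[:, selected], y, rcond=None)
        residual = y - D[:, selected] @ coef
    return selected, coef

# Toy problem: the output depends on only 2 of 20 candidate terms.
rng = np.random.default_rng(0)
D = rng.standard_normal((200, 20))
y = 3.0 * D[:, 4] - 2.0 * D[:, 11]
terms, coef = greedy_term_selection(D, y, n_terms=2)
```

On this toy dictionary the two true terms are recovered exactly, which is the behaviour the abstract describes: a handful of significant terms out of a large candidate set.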
A Simple Iterative Algorithm for Parsimonious Binary Kernel Fisher Discrimination
By applying recent results in optimization theory, variously known as optimization transfer or majorize/minimize (MM) algorithms, an algorithm for binary kernel Fisher discriminant analysis is introduced that uses a non-smooth penalty on the coefficients to provide a parsimonious solution. The problem is converted into a smooth optimization that can be solved iteratively with no greater overhead than iteratively re-weighted least squares. The resulting algorithm is simple, easily programmed, and shown to perform, in terms of both accuracy and parsimony, as well as or better than a number of leading machine learning algorithms on two well-studied and substantial benchmarks.
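The reduction to iteratively re-weighted least squares can be sketched on the simpler L1-penalised least-squares problem; the same majorize/minimize trick applies to the kernel Fisher objective. A minimal numpy illustration using the standard quadratic majorizer of |b|, not the paper's exact algorithm:

```python
import numpy as np

def mm_l1_least_squares(X, y, lam, n_iter=200, eps=1e-8):
    """Majorize/minimize for  min_b ||y - X b||^2 + lam * sum_j |b_j|.
    Each |b_j| is majorized by the quadratic
        b_j^2 / (2|b_j_old|) + |b_j_old| / 2,
    so every iteration is a re-weighted ridge solve."""
    b = np.linalg.lstsq(X, y, rcond=None)[0]   # smooth starting point
    G = X.T @ X
    Xty = X.T @ y
    for _ in range(n_iter):
        w = lam / (2.0 * (np.abs(b) + eps))    # majorizer weights
        b = np.linalg.solve(G + np.diag(w), Xty)
    return b

# Toy problem: only the first of 10 coefficients is nonzero.
rng = np.random.default_rng(1)
X = rng.standard_normal((100, 10))
y = 2.0 * X[:, 0] + 0.01 * rng.standard_normal(100)
b = mm_l1_least_squares(X, y, lam=5.0)
```

The non-smooth penalty drives the irrelevant coefficients to (numerically) zero while each iteration costs no more than a weighted least-squares solve, which is the computational point the abstract makes.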
A unified wavelet-based modelling framework for non-linear system identification: the WANARX model structure
A new unified modelling framework based on the superposition of additive submodels, functional components, and wavelet decompositions is proposed for non-linear system identification. A non-linear model, which is often represented using a multivariate non-linear function, is initially decomposed into a number of functional components via the well-known analysis of variance (ANOVA) expansion, which can be viewed as a special form of the NARX (non-linear autoregressive with exogenous inputs) model for representing dynamic input–output systems. By expanding each functional component using wavelet decompositions, including the regular lattice frame decomposition, wavelet series and multiresolution wavelet decompositions, the multivariate non-linear model can then be converted into a linear-in-the-parameters problem, which can be solved using least-squares type methods. An efficient model structure determination approach based upon a forward orthogonal least squares (OLS) algorithm, which involves a stepwise orthogonalization of the regressors and a forward selection of the relevant model terms based on the error reduction ratio (ERR), is employed to solve the linear-in-the-parameters problem in the present study. The new modelling structure is referred to as a wavelet-based ANOVA decomposition of the NARX model, or simply the WANARX model, and can be applied to represent high-order and high-dimensional non-linear systems.
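The forward OLS/ERR selection loop described above can be sketched as follows; this is a generic numpy illustration of the classical algorithm, in which the variable names and stopping tolerance are choices of the sketch:

```python
import numpy as np

def forward_ols_err(P, y, err_tol=0.999):
    """Forward OLS with error-reduction-ratio (ERR) selection: candidates
    are orthogonalized against the chosen terms step by step, and the
    term explaining the largest fraction of the output energy is added
    until the cumulative ERR reaches err_tol."""
    y_energy = y @ y
    Q = P.astype(float).copy()       # candidates, orthogonalized in place
    selected = []
    total_err = 0.0
    while total_err < err_tol and len(selected) < P.shape[1]:
        err = np.zeros(P.shape[1])
        for j in range(P.shape[1]):
            if j not in selected:
                w = Q[:, j]
                # ERR: fraction of output energy explained by this direction.
                err[j] = (w @ y) ** 2 / ((w @ w) * y_energy + 1e-12)
        k = int(np.argmax(err))
        selected.append(k)
        total_err += err[k]
        wk = Q[:, k].copy()
        # Deflate the remaining candidates against the chosen direction.
        Q -= np.outer(wk, (wk @ Q) / (wk @ wk))
        Q[:, k] = wk
    theta = np.linalg.lstsq(P[:, selected], y, rcond=None)[0]
    return selected, theta, total_err

# Toy problem: only regressors 2 and 7 enter the output.
rng = np.random.default_rng(2)
P = rng.standard_normal((150, 12))
y = 2.0 * P[:, 2] - 1.0 * P[:, 7]
sel, theta, total = forward_ols_err(P, y)
```

The cumulative ERR reaching the tolerance after two steps mirrors how the procedure truncates a large candidate set down to the few significant terms.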
Efficient least angle regression for identification of linear-in-the-parameters models
Least angle regression, as a promising model selection method, differentiates itself from conventional stepwise and stagewise methods in that it is neither too greedy nor too slow. It is closely related to L1-norm optimization, which achieves low prediction variance by sacrificing some model bias in order to enhance generalization capability. In this paper, we propose an efficient least angle regression algorithm for a large class of linear-in-the-parameters models, with the purpose of accelerating the model selection process. The entire algorithm works in a fully recursive manner: the correlations between model terms and residuals, the evolving directions and other pertinent variables are derived explicitly and updated successively at every subset selection step, and the model coefficients are computed only when the algorithm finishes, thereby avoiding direct matrix inversions. A detailed computational complexity analysis indicates that the proposed algorithm is significantly more efficient than the original approach, in which the well-known Cholesky decomposition is used to solve least angle regression. Three artificial and real-world examples demonstrate the effectiveness, efficiency and numerical stability of the proposed algorithm.
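A feel for this family of methods can be had from incremental forward-stagewise regression, whose infinitesimal-step limit is least angle regression. The numpy sketch below is a generic illustration of the stagewise relative, not the recursive algorithm proposed in the paper:

```python
import numpy as np

def forward_stagewise(X, y, step=0.01, tol=1.0, max_steps=5000):
    """Incremental forward-stagewise regression: each step nudges the
    coefficient of the term most correlated with the residual by a small
    fixed amount, stopping once all correlations fall below tol."""
    beta = np.zeros(X.shape[1])
    r = y.astype(float).copy()
    for _ in range(max_steps):
        c = X.T @ r                        # correlations with the residual
        j = int(np.argmax(np.abs(c)))
        if abs(c[j]) < tol:
            break                          # nothing left worth selecting
        delta = step * np.sign(c[j])
        beta[j] += delta
        r -= delta * X[:, j]               # cheap residual update, no inversion
    return beta

# Toy problem: only terms 0 and 3 carry signal.
rng = np.random.default_rng(3)
X = rng.standard_normal((150, 8))
y = 1.5 * X[:, 0] - 0.8 * X[:, 3]
beta = forward_stagewise(X, y)
```

Note how the residual is updated incrementally rather than by refitting; the paper's contribution is to carry out the analogous correlation and direction updates for exact LARS recursively, deferring all coefficient computation to the end.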
Multivariate Analysis of Tumour Gene Expression Profiles Applying Regularisation and Bayesian Variable Selection Techniques
High-throughput microarray technology is here to stay, e.g. in oncology for tumour classification and gene expression profiling to predict cancer pathology and clinical outcome. The global objective of this thesis is to investigate multivariate methods that are suitable for this task. After introducing the problem and the biological background, an overview of multivariate regularisation methods is given in Chapter 3 and the binary classification problem is outlined (Chapter 4). The focus of the applications presented in Chapters 5 to 7 is on sparse binary classifiers that are both parsimonious and interpretable. Particular emphasis is on sparse penalised likelihood and Bayesian variable selection models, all in the context of logistic regression. The thesis concludes with a final discussion chapter.
The variable selection problem is particularly challenging here, since the number of variables is much larger than the sample size, which results in an ill-conditioned problem with many equally good solutions. One open problem is therefore the stability of gene expression profiles. In a resampling study, various characteristics including stability are compared between a variety of classifiers applied to five gene expression data sets and validated on two independent data sets.
Bayesian variable selection provides an alternative to resampling for estimating the uncertainty in the selection of genes. MCMC methods are used for model space exploration, but because of the high dimensionality, standard algorithms are computationally expensive and/or result in poor Markov chain mixing. A novel MCMC algorithm is presented that uses the dependence structure between input variables to find blocks of variables to be updated together. This drastically improves mixing while keeping the computational burden acceptable. Several algorithms are compared in a simulation study. In an ovarian cancer application in Chapter 7, the best-performing MCMC algorithms are combined with parallel tempering and compared with an alternative method.
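For intuition, a baseline single-flip Metropolis sampler over inclusion indicators can be sketched as follows; the thesis's contribution is precisely to replace such single-variable updates with block updates over correlated variables. The BIC-style score used here is a simplified stand-in for a proper marginal likelihood:

```python
import numpy as np

def log_score(X, y, gamma, penalty):
    """BIC-style log score for the submodel given by boolean indicator gamma."""
    n = len(y)
    if gamma.any():
        b, *_ = np.linalg.lstsq(X[:, gamma], y, rcond=None)
        rss = np.sum((y - X[:, gamma] @ b) ** 2)
    else:
        rss = y @ y
    return -0.5 * n * np.log(rss / n) - penalty * gamma.sum()

def mcmc_variable_selection(X, y, n_iter=3000, penalty=5.0, seed=0):
    """Single-flip Metropolis over inclusion indicators: propose toggling
    one variable, accept by the score ratio, and average the indicators
    to estimate posterior inclusion frequencies."""
    rng = np.random.default_rng(seed)
    p = X.shape[1]
    gamma = np.zeros(p, dtype=bool)
    cur = log_score(X, y, gamma, penalty)
    counts = np.zeros(p)
    for _ in range(n_iter):
        prop = gamma.copy()
        j = rng.integers(p)
        prop[j] = not prop[j]
        new = log_score(X, y, prop, penalty)
        if np.log(rng.random()) < new - cur:   # Metropolis accept step
            gamma, cur = prop, new
        counts += gamma
    return counts / n_iter

# Toy data: only variables 1 and 5 affect the response.
rng = np.random.default_rng(4)
X = rng.standard_normal((80, 10))
y = 2.0 * X[:, 1] + 2.0 * X[:, 5] + 0.5 * rng.standard_normal(80)
freq = mcmc_variable_selection(X, y)
```

With p in the thousands and heavy correlation between genes, such one-variable-at-a-time flips mix poorly, which motivates the blocked updates studied in the thesis.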
Tensor Networks for Dimensionality Reduction and Large-Scale Optimizations. Part 2 Applications and Future Perspectives
Part 2 of this monograph builds on the introduction to tensor networks and their operations presented in Part 1. It focuses on tensor network models for super-compressed higher-order representation of data/parameters and related cost functions, while providing an outline of their applications in machine learning and data analytics. A particular emphasis is on the tensor train (TT) and Hierarchical Tucker (HT) decompositions, and their physically meaningful interpretations which reflect the scalability of the tensor network approach. Through a graphical approach, we also elucidate how, by virtue of the underlying low-rank tensor approximations and sophisticated contractions of core tensors, tensor networks have the ability to perform distributed computations on otherwise prohibitively large volumes of data/parameters, thereby alleviating or even eliminating the curse of dimensionality. The usefulness of this concept is illustrated over a number of applied areas, including generalized regression and classification (support tensor machines, canonical correlation analysis, higher-order partial least squares), generalized eigenvalue decomposition, Riemannian optimization, and the optimization of deep neural networks. Part 1 and Part 2 of this work can be used either as stand-alone separate texts, or indeed as a conjoint comprehensive review of the exciting field of low-rank tensor networks and tensor decompositions.
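The TT decomposition underlying much of this compression can be sketched with the standard TT-SVD procedure (sequential truncated SVDs of tensor unfoldings); this generic numpy sketch is an illustration, not code from the monograph:

```python
import numpy as np

def tt_svd(A, eps=1e-10):
    """Tensor-train decomposition by sequential truncated SVDs.  Returns
    3-way cores G_k of shape (r_{k-1}, n_k, r_k); singular values below
    eps (relative) are dropped, which is the source of the compression."""
    shape = A.shape
    cores, r_prev = [], 1
    M = np.asarray(A, dtype=float)
    for k in range(len(shape) - 1):
        M = M.reshape(r_prev * shape[k], -1)   # unfold at the current mode
        U, s, Vt = np.linalg.svd(M, full_matrices=False)
        r = max(1, int(np.sum(s > eps * s[0])))  # truncated TT rank
        cores.append(U[:, :r].reshape(r_prev, shape[k], r))
        M = s[:r, None] * Vt[:r]               # pass the remainder along
        r_prev = r
    cores.append(M.reshape(r_prev, shape[-1], 1))
    return cores

def tt_reconstruct(cores):
    """Contract the cores back into the full tensor."""
    out = cores[0]
    for G in cores[1:]:
        out = np.tensordot(out, G, axes=([-1], [0]))
    return out.reshape([c.shape[1] for c in cores])

# A rank-1 tensor compresses to TT ranks of 1.
u, v, w = np.arange(1.0, 4.0), np.arange(1.0, 5.0), np.arange(1.0, 3.0)
A = np.einsum('i,j,k->ijk', u, v, w)
cores = tt_svd(A)
B = tt_reconstruct(cores)
```

The storage drops from the product of all mode sizes to a sum of small core sizes whenever the TT ranks are low, which is the mechanism behind the "super-compressed" representations discussed in the text.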
Employee Heterogeneity and Within-Firm Experience-Earnings Profiles: A Nonparametric Analysis
Motivated by a priori uncertainty with respect to the parametric specification of the earnings function, I model the earnings function as a semiparametric partially linear model and follow the estimation approach described in Robinson (1988). Using data from the personnel records of a large UK-based financial sector employer, I let years of within-firm and pre-firm experience form the nonparametrically modelled component of the earnings function. It is shown that the estimated within-firm experience-earnings profiles, which are conditional upon a given number of years of pre-firm experience accumulated before entry, converge and even overtake one another as years of pre-firm experience increase. This result can be explained by the presence of unobservable explanatory variables, such as the match quality and individual quality of the employees, both of which are functions of years of within- and pre-firm experience and wages.
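Robinson's (1988) double-residual estimator referred to above is straightforward to sketch: kernel-smooth both the outcome and the linear regressor on the nonparametric covariate, then regress residuals on residuals. A minimal numpy illustration with simulated data, where the kernel, bandwidth and data-generating process are choices of this sketch:

```python
import numpy as np

def nw_smoother(z, t, h):
    """Nadaraya-Watson kernel regression of t on z (Gaussian kernel)."""
    K = np.exp(-0.5 * ((z[:, None] - z[None, :]) / h) ** 2)
    return (K @ t) / K.sum(axis=1)

def robinson_plm(y, x, z, h):
    """Robinson's double-residual estimator for  y = x*beta + g(z) + e:
    partial out z from both y and x nonparametrically, then estimate
    beta by OLS of the y-residuals on the x-residuals."""
    ry = y - nw_smoother(z, y, h)
    rx = x - nw_smoother(z, x, h)
    beta = (rx @ ry) / (rx @ rx)
    ghat = nw_smoother(z, y - x * beta, h)   # recovered nonparametric part
    return beta, ghat

# Simulated partially linear model: linear effect 2.0, g(z) = sin(2*pi*z).
rng = np.random.default_rng(5)
n = 400
z = rng.uniform(0.0, 1.0, n)
x = z + rng.standard_normal(n)               # x correlated with z
y = 2.0 * x + np.sin(2 * np.pi * z) + 0.1 * rng.standard_normal(n)
beta, ghat = robinson_plm(y, x, z, h=0.05)
```

Because x and z are correlated by construction, a naive OLS of y on x alone would be biased; partialling z out of both variables first is what delivers a consistent estimate of the parametric coefficient.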