
    Rainbow plots, Bagplots and Boxplots for Functional Data

    We propose new tools for visualizing large numbers of functional data in the form of smooth curves or surfaces. The proposed tools include functional versions of the bagplot and boxplot, and make use of the first two robust principal component scores, Tukey's data depth and highest density regions. By-products of our graphical displays are outlier detection methods for functional data. We compare these new outlier detection methods with existing methods for detecting outliers in functional data and show that our methods are better able to identify the outliers.
    Keywords: Highest density regions, Robust principal component analysis, Kernel density estimation, Outlier detection, Tukey's halfspace depth
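
    As a rough illustration of the highest-density-region idea behind these displays, the sketch below flags curves whose first two principal component scores fall in a low-density region of the score plane. Ordinary PCA and scipy's gaussian_kde stand in for the robust PCA and depth machinery of the paper, and the 99% coverage level is an illustrative assumption.

        # Minimal sketch of HDR-based outlier detection for functional data.
        # NOTE: ordinary PCA stands in for the paper's robust PCA.
        import numpy as np
        from scipy.stats import gaussian_kde
        from sklearn.decomposition import PCA

        def hdr_outliers(curves, coverage=0.99):
            """Flag curves whose first two PC scores fall outside the
            `coverage` highest density region of the score distribution."""
            scores = PCA(n_components=2).fit_transform(curves)  # (n, 2) scores
            kde = gaussian_kde(scores.T)                        # bivariate KDE
            density = kde(scores.T)                             # density at each curve's scores
            # The HDR threshold is the density quantile leaving 1 - coverage outside.
            threshold = np.quantile(density, 1.0 - coverage)
            return np.where(density < threshold)[0]             # indices of flagged curves

        # Example: 200 smooth curves on 50 grid points, plus one shifted outlier.
        rng = np.random.default_rng(0)
        t = np.linspace(0, 1, 50)
        curves = np.sin(2 * np.pi * t) + 0.1 * rng.standard_normal((200, 50))
        curves[0] += 2.0                                        # inject an outlier
        print(hdr_outliers(curves))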

    Nonlinear process fault detection and identification using kernel PCA and kernel density estimation

    Kernel principal component analysis (KPCA) is an effective and efficient technique for monitoring nonlinear processes. However, associating it with upper control limits (UCLs) based on the Gaussian distribution can deteriorate its performance. In this paper, the kernel density estimation (KDE) technique was used to estimate UCLs for KPCA-based nonlinear process monitoring. The monitoring performance of the resulting KPCA–KDE approach was then compared with that of KPCA whose UCLs were based on the Gaussian distribution. Tests on the Tennessee Eastman process show that KPCA–KDE is more robust and provides better overall performance than KPCA with Gaussian assumption-based UCLs in both sensitivity and detection time. An efficient KPCA–KDE-based fault identification approach using complex step differentiation is also proposed.
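
    The sketch below illustrates the KDE-based control limit at the heart of this approach: KPCA scores from normal operating data yield a monitoring statistic, and a kernel density estimate of that statistic supplies the upper control limit. The Hotelling-style T² statistic, RBF kernel settings and 99% confidence level are illustrative assumptions, not the paper's exact configuration.

        # Minimal sketch of a KDE-estimated upper control limit for KPCA monitoring.
        import numpy as np
        from scipy.stats import gaussian_kde
        from sklearn.decomposition import KernelPCA

        def kde_ucl(stats, alpha=0.99):
            """Estimate the UCL as the alpha-quantile of the monitoring
            statistic under a kernel density estimate."""
            kde = gaussian_kde(stats)
            grid = np.linspace(stats.min(), stats.max() * 2, 2000)
            cdf = np.array([kde.integrate_box_1d(-np.inf, g) for g in grid])
            idx = np.searchsorted(cdf, alpha)
            return grid[min(idx, len(grid) - 1)]

        rng = np.random.default_rng(1)
        X_train = rng.standard_normal((300, 10))            # normal operating data
        kpca = KernelPCA(n_components=5, kernel="rbf", gamma=0.1).fit(X_train)
        scores = kpca.transform(X_train)
        t2 = np.sum(scores**2 / scores.var(axis=0), axis=1)  # Hotelling-style T^2
        ucl = kde_ucl(t2)                                    # KDE-based control limit
        print("UCL:", ucl, "alarms on training data:", np.sum(t2 > ucl))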

    Efficient online subspace learning with an indefinite kernel for visual tracking and recognition

    We propose an exact framework for online learning with a family of indefinite (not positive definite) kernels. As we study the case of non-positive kernels, we first show how to extend kernel principal component analysis (KPCA) from a reproducing kernel Hilbert space to Krein space. We then formulate an incremental KPCA in Krein space that does not require the calculation of preimages and is therefore both efficient and exact. Our approach is motivated by the application of visual tracking, for which we wish to employ a robust gradient-based kernel. We use the proposed nonlinear appearance model, learned online via KPCA in Krein space, for visual tracking in many popular and difficult tracking scenarios. We also show applications of our kernel framework to the problem of face recognition.
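
    A batch (non-incremental) sketch of KPCA with an indefinite kernel is given below, assuming the Krein-space treatment amounts to ranking eigenpairs by eigenvalue magnitude and recording the signs of the retained eigenvalues as the metric signature. The sigmoid kernel is a standard example of a kernel that need not be positive definite; the paper's efficient preimage-free incremental update is not reproduced here.

        # Minimal sketch of batch KPCA with an indefinite (sigmoid) kernel.
        import numpy as np

        def sigmoid_kernel(X, a=0.01, b=-0.5):
            return np.tanh(a * X @ X.T + b)          # may have negative eigenvalues

        def krein_kpca(K, n_components=3):
            n = K.shape[0]
            H = np.eye(n) - np.ones((n, n)) / n      # centering matrix
            Kc = H @ K @ H                           # centered kernel matrix
            vals, vecs = np.linalg.eigh(Kc)          # real symmetric eigendecomposition
            order = np.argsort(-np.abs(vals))[:n_components]  # rank by |eigenvalue|
            vals, vecs = vals[order], vecs[:, order]
            signs = np.sign(vals)                    # indefinite metric signature
            # Scale eigenvectors so projections respect the Krein inner product.
            alphas = vecs / np.sqrt(np.abs(vals))
            return Kc @ alphas, signs                # scores and their metric signs

        rng = np.random.default_rng(2)
        X = rng.standard_normal((100, 6))
        scores, signs = krein_kpca(sigmoid_kernel(X))
        print(scores.shape, signs)                   # (100, 3) and a +/-1 signature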

    Model Selection Techniques for Kernel-Based Regression Analysis Using Information Complexity Measure and Genetic Algorithms

    In statistical modeling, an overparameterized model leads to poor generalization on unseen data points. This issue calls for a model selection technique that appropriately chooses the form of the model, its parameters and the independent variables retained for the modeling. Model selection is particularly important for linear and nonlinear statistical models, which can easily be overfitted. Recently, support vector machines (SVMs), also known as kernel-based methods, have drawn much attention as the next generation of nonlinear modeling techniques. The model selection issues for SVMs include the selection of the kernel, the corresponding parameters and the optimal subset of independent variables. In the current literature, k-fold cross-validation is the model selection method most widely used for SVMs by machine learning researchers. However, cross-validation is computationally intensive, since one has to fit the model k times.
    This dissertation introduces a model selection criterion based on the information complexity (ICOMP) measure for kernel-based regression analysis and its applications. ICOMP penalizes both the lack of fit and the complexity of the model to choose the optimal model with good generalization properties. ICOMP provides a simple index for each model, does not require any validation data, is computationally efficient, and has been successfully applied to various linear model selection problems. In this dissertation, we bring ICOMP to nonlinear kernel-based modeling. Specifically, this dissertation proposes ICOMP and its various forms for kernel ridge regression; kernel partial least squares regression; kernel principal component analysis; kernel principal component regression; relevance vector regression; relevance vector logistic regression; and classification problems. The model selection tasks achieved by the proposed criterion include choosing the form of the kernel function, the parameters of the kernel function, the ridge parameter, the number of latent variables, the number of principal components and the optimal subset of input variables, all in a simultaneous fashion for intelligent data mining.
    The performance of the proposed model selection method is tested on simulation benchmark data sets as well as real data sets. The predictive performance of the proposed model selection criteria is comparable to, and in some cases better than, that of cross-validation, which is costly to compute. This dissertation also combines the genetic algorithm (GA) with ICOMP for variable subsetting, which significantly decreases the computational time compared to an exhaustive search of all possible subsets. The GA procedure is shown to be robust and performs well in our repeated simulation examples. This dissertation therefore provides researchers an alternative, computationally efficient model selection approach for data analysis using kernel methods.
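
    As a rough sketch of how an ICOMP-style criterion can drive kernel model selection, the example below scores kernel ridge regression fits with a simplified ICOMP, lack of fit plus twice Bozdogan's C1 complexity of an assumed parameter covariance, and uses it to pick the ridge parameter. The exact ICOMP form and covariance estimator in the dissertation may differ, and the genetic-algorithm variable search is omitted.

        # Minimal sketch of ICOMP-guided ridge-parameter selection for kernel
        # ridge regression. The covariance sigma^2 * (K + lam*I)^{-1} is an
        # assumed stand-in for the dissertation's estimator.
        import numpy as np

        def rbf_kernel(X, gamma=1.0):
            sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
            return np.exp(-gamma * sq)

        def icomp_krr(K, y, lam):
            n = len(y)
            A = K + lam * np.eye(n)
            coef = np.linalg.solve(A, y)             # kernel ridge coefficients
            resid = y - K @ coef
            sigma2 = resid @ resid / n               # ML error variance
            cov = sigma2 * np.linalg.inv(A)          # assumed parameter covariance
            s = n                                    # rank of the covariance
            c1 = 0.5 * s * np.log(np.trace(cov) / s) - 0.5 * np.linalg.slogdet(cov)[1]
            return n * np.log(sigma2) + 2.0 * c1     # smaller is better

        rng = np.random.default_rng(3)
        X = rng.uniform(-1, 1, (80, 1))
        y = np.sin(3 * X[:, 0]) + 0.1 * rng.standard_normal(80)
        K = rbf_kernel(X, gamma=2.0)
        lams = [1e-3, 1e-2, 1e-1, 1.0]
        best = min(lams, key=lambda lam: icomp_krr(K, y, lam))
        print("ridge parameter chosen by ICOMP:", best)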