Kernel Multivariate Analysis Framework for Supervised Subspace Learning: A Tutorial on Linear and Kernel Multivariate Methods
Feature extraction and dimensionality reduction are important tasks in many
fields of science dealing with signal processing and analysis. The relevance of
these techniques is increasing as current sensory devices are developed with
ever higher resolution, and problems involving multimodal data sources become
more common. A plethora of feature extraction methods are available in the
literature collectively grouped under the field of Multivariate Analysis (MVA).
This paper provides a uniform treatment of several methods: Principal Component
Analysis (PCA), Partial Least Squares (PLS), Canonical Correlation Analysis
(CCA) and Orthonormalized PLS (OPLS), as well as their non-linear extensions
derived by means of the theory of reproducing kernel Hilbert spaces. We also
review their connections to other methods for classification and statistical
dependence estimation, and introduce some recent developments to deal with the
extreme cases of large-scale and small-sample problems. To illustrate the wide
applicability of these methods in both classification and regression problems,
we analyze their performance in a benchmark of publicly available data sets,
and pay special attention to specific real applications involving audio
processing for music genre prediction and hyperspectral satellite images for
Earth and climate monitoring.
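The linear/kernel contrast the tutorial surveys can be sketched in a few lines. The snippet below (an illustrative scikit-learn example on synthetic data, not code from the paper; the number of components and the RBF `gamma` are arbitrary choices) contrasts linear PCA with its kernel extension derived from a reproducing kernel Hilbert space:

```python
import numpy as np
from sklearn.decomposition import PCA, KernelPCA

# Toy data: 100 samples, 10 features (stand-ins for e.g. spectral bands)
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))

# Linear PCA: projections onto the directions of maximum variance
pca = PCA(n_components=2)
X_lin = pca.fit_transform(X)

# Kernel PCA with an RBF kernel: the nonlinear extension via an RKHS
kpca = KernelPCA(n_components=2, kernel="rbf", gamma=0.1)
X_ker = kpca.fit_transform(X)

print(X_lin.shape, X_ker.shape)
```

The same pattern (linear method, then the kernelized variant) applies to the PLS, CCA, and OPLS methods the paper treats in its uniform framework.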
Supervised classification methods applied to airborne hyperspectral images: Comparative study using mutual information
Hyperspectral remote sensing imagery (HSI) has become an important tool for
observing the Earth's surface, detecting climatic changes, and many other
applications. The classification of HSI is one of the most challenging tasks
due to the large amount of spectral information and the presence of redundant
and irrelevant bands. Although great progress has been made on classification
techniques, few studies have provided practical guidelines for choosing an
appropriate classifier for HSI. In this paper, we investigate the
classification accuracy of four supervised learning algorithms, namely
Support Vector Machines (SVM), Random Forest (RF), K-Nearest Neighbors (KNN),
and Linear Discriminant Analysis (LDA), with different kernels. The
experiments were performed on three real hyperspectral datasets acquired by
NASA's Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) and the
Reflective Optics System Imaging Spectrometer (ROSIS) sensors. Mutual
information was used to reduce the dimensionality of the datasets for better
classification efficiency. The extensive experiments demonstrate that the SVM
classifier with an RBF kernel and RF produce statistically better results and
appear to be the most suitable supervised classifiers for hyperspectral
remote sensing images.
Keywords: hyperspectral images, mutual information, dimension reduction,
Support Vector Machines, K-Nearest Neighbors, Random Forest, Linear
Discriminant Analysis
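The pipeline this study describes (mutual-information-based dimensionality reduction followed by an RBF-SVM) can be sketched as below. This is an illustrative scikit-learn reconstruction on synthetic data, not the authors' code; the number of retained bands (`k=10`) and the SVM hyperparameters are assumptions:

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for a hyperspectral scene: 200 pixels x 50 bands, 3 classes
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 50))
y = rng.integers(0, 3, size=200)
X[:, 0] += y  # make one band informative about the class label

# Keep the 10 bands with highest mutual information, then classify with an RBF-SVM
clf = make_pipeline(
    SelectKBest(mutual_info_classif, k=10),
    SVC(kernel="rbf", C=1.0, gamma="scale"),
)
scores = cross_val_score(clf, X, y, cv=3)
print(scores.mean())
```

Swapping the `SVC` step for `RandomForestClassifier`, `KNeighborsClassifier`, or `LinearDiscriminantAnalysis` reproduces the paper's comparative setup.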
Detecting Premature Ventricular Contraction by using Regulated Discriminant Analysis with very sparse training data
Pathological electrocardiograms are often used to diagnose abnormal cardiac disorders, where accurate classification of the cardiac beat types is crucial for timely diagnosis of dangerous conditions. However, accurate, timely, and precise detection of arrhythmia types such as premature ventricular contraction is very challenging because these signals are multiform, i.e. reliable detection requires expert annotations. In this paper, a multivariate statistical classifier that detects premature ventricular contraction beats is presented. This novel classifier can be trained with a very sparse amount of expert-annotated data. To enable this, the dimensionality of the feature vector is kept very low, and the classifier uses well-designed features and a regularization mechanism. The approach is compared to other classifiers using the MIT-BIH arrhythmia database. The average accuracy, specificity, and sensitivity are all above 96%, which is superior given the sparse amount of training data.
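The general idea (a regularized discriminant classifier trained on very few annotated beats with a low-dimensional feature vector) can be sketched with scikit-learn's shrinkage LDA. This is a standard covariance-regularization mechanism, not necessarily the paper's exact one, and the feature values here are synthetic placeholders:

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Hypothetical low-dimensional beat features: only 12 annotated training beats
rng = np.random.default_rng(2)
X_train = rng.normal(size=(12, 4))
y_train = rng.integers(0, 2, size=12)
y_train[:2] = [0, 1]  # ensure both beat classes (normal / PVC) are present

# shrinkage="auto" (Ledoit-Wolf) regularizes the covariance estimate,
# which stabilizes the classifier when training data are very sparse
lda = LinearDiscriminantAnalysis(solver="lsqr", shrinkage="auto")
lda.fit(X_train, y_train)
pred = lda.predict(rng.normal(size=(5, 4)))
print(pred.shape)
```

Without the shrinkage term, the pooled covariance estimated from only 12 samples would be poorly conditioned, which is exactly the failure mode regularization addresses here.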
A Novel Hybrid Dimensionality Reduction Method using Support Vector Machines and Independent Component Analysis
Due to the increasing demand for high-dimensional data analysis in applications such as electrocardiogram signal analysis and gene expression analysis for cancer detection, dimensionality reduction has become a viable process for extracting essential information from data: the high-dimensional data can be represented in a more condensed form with much lower dimensionality, which both improves classification accuracy and reduces computational complexity. Conventional dimensionality reduction methods can be categorized into stand-alone and hybrid approaches. A stand-alone method uses a single criterion, from either a supervised or an unsupervised perspective; a hybrid method integrates both criteria. Compared with stand-alone dimensionality reduction methods, the hybrid approach is promising because it simultaneously takes advantage of the supervised criterion for better classification accuracy and the unsupervised criterion for better data representation. However, several issues challenge the efficiency of the hybrid approach, including (1) the difficulty of finding a subspace that seamlessly integrates both criteria in a single hybrid framework, (2) the robustness of performance on noisy data, and (3) nonlinear data representation capability.
This dissertation presents a new hybrid dimensionality reduction method that seeks a projection by optimizing both structural risk (the supervised criterion) from Support Vector Machines (SVM) and data independence (the unsupervised criterion) from Independent Component Analysis (ICA). The projection from SVM contributes directly to classification performance from a supervised perspective, whereas the ICA projection, by maximizing independence among features, improves classification accuracy indirectly through a better intrinsic data representation from an unsupervised perspective. For the linear dimensionality reduction model, I introduce orthogonality to interrelate the projections from SVM and ICA, while a redundancy removal process eliminates part of the projection vectors from SVM, leading to more effective dimensionality reduction. The orthogonality-based linear hybrid method is then extended to an uncorrelatedness-based algorithm with nonlinear data representation capability, in which SVM and ICA are integrated into a single framework through an uncorrelated subspace based on a kernel implementation.
Experimental results show that the proposed approaches achieve higher classification performance, with better robustness, at relatively lower dimensions than conventional methods on high-dimensional datasets.
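One plausible reading of the orthogonality-based linear hybrid (an SVM hyperplane normal interrelated with ICA directions via orthogonalization) can be sketched as follows. This is an assumption-laden illustration, not the dissertation's algorithm: the specific way of combining the directions, the component counts, and the data are all invented for the sketch:

```python
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.decomposition import FastICA

# Toy data: 150 samples, 8 features, binary labels
rng = np.random.default_rng(3)
X = rng.normal(size=(150, 8))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

# Supervised direction: the normal of a linear SVM's separating hyperplane
svm = LinearSVC(C=1.0, dual=False).fit(X, y)
w = svm.coef_.ravel()
w /= np.linalg.norm(w)

# Unsupervised directions: ICA unmixing vectors (maximal independence)
ica = FastICA(n_components=3, random_state=0)
ica.fit(X)
V = ica.components_  # shape (3, 8)

# Orthogonalize the ICA directions against the SVM direction,
# so the combined basis carries no redundant supervised information
V_orth = V - np.outer(V @ w, w)
P = np.vstack([w, V_orth])  # combined projection, shape (4, 8)
Z = X @ P.T                 # reduced representation, shape (150, 4)
print(Z.shape)
```

The kernelized, uncorrelatedness-based extension described in the abstract would replace these explicit direction vectors with expansions in a kernel-induced feature space.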