Two view learning: SVM-2K, theory and practice
Kernel methods make it relatively easy to define complex high-dimensional
feature spaces. This raises the question of how we can
identify the relevant subspaces for a particular learning task. When two
views of the same phenomenon are available, kernel Canonical Correlation
Analysis (KCCA) has been shown to be an effective preprocessing
step that can improve the performance of classification algorithms such
as the Support Vector Machine (SVM). This paper takes this observation
to its logical conclusion and proposes a method that combines this
two-stage learning (KCCA followed by SVM) into a single optimisation
termed SVM-2K. We present both experimental and theoretical analysis
of the approach, showing encouraging results and insights.
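To make the idea of a single optimisation concrete, the following is a sketch of how two SVMs, one per view, might be coupled in one problem. It is an illustration under stated assumptions, not a formulation given in the abstract above. The symbols are all assumed for this sketch: feature maps \phi_A and \phi_B for the two views, weight vectors w_A, w_B with biases b_A, b_B, slack variables \xi^A, \xi^B, \eta, trade-off constants C^A, C^B, D, and tolerance \epsilon. The first constraint asks the two view-specific decision functions to agree up to \epsilon on each training point, which plays the role that the correlation step plays in the two-stage approach.

```latex
\begin{aligned}
\min_{w_A, w_B, b_A, b_B, \xi^A, \xi^B, \eta}\quad
  & \tfrac{1}{2}\|w_A\|^2 + \tfrac{1}{2}\|w_B\|^2
    + C^A \sum_{i=1}^{\ell} \xi_i^A + C^B \sum_{i=1}^{\ell} \xi_i^B
    + D \sum_{i=1}^{\ell} \eta_i \\
\text{s.t.}\quad
  & \bigl| \langle w_A, \phi_A(x_i)\rangle + b_A
         - \langle w_B, \phi_B(x_i)\rangle - b_B \bigr| \le \eta_i + \epsilon, \\
  & y_i\bigl(\langle w_A, \phi_A(x_i)\rangle + b_A\bigr) \ge 1 - \xi_i^A, \\
  & y_i\bigl(\langle w_B, \phi_B(x_i)\rangle + b_B\bigr) \ge 1 - \xi_i^B, \\
  & \xi_i^A \ge 0,\ \xi_i^B \ge 0,\ \eta_i \ge 0, \qquad i = 1, \dots, \ell.
\end{aligned}
```

In this sketch, setting D = 0 removes the coupling and leaves two independent soft-margin SVMs, while a larger D pushes the two views towards the same output on every training example.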
Integrative analysis of gene expression and copy number alterations using canonical correlation analysis
Supplementary Figure 1. Representation of the samples from the tuning set by their coordinates in the first two pairs of features (extracted from the tuning set) using regularized dual CCA, with regularization parameters tx = 0.9, ty = 0.3 (left panel), and PCA+CCA (right panel). We show the representations with respect to both the copy number features and the gene expression features in a superimposed way, where each sample is represented by two markers. The filled markers represent the coordinates in the features extracted from the copy number variables, and the open markers represent coordinates in the features extracted from the gene expression variables. Samples with different leukemia subtypes are shown with different colors. The first feature pair distinguishes the HD50 group from the rest, while the second feature pair represents the characteristics of the samples from the E2A/PBX1 subtype. The high canonical correlation obtained for the tuning samples with regularized dual CCA is apparent in the left panel, where the two points for each sample coincide. Nevertheless, the extracted features have a high generalization ability, as can be seen in the left panel of Figure 5, showing the representation of the validation samples.

Supplementary Figure 2. Representation of the samples from the tuning set by their coordinates in the first two pairs of features (extracted from the tuning set) using regularized dual CCA, with regularization parameters tx = 0, ty = 0 (left panel), and tx = 1, ty = 1 (right panel). We show the representations with respect to both the copy number features and the gene expression features in a superimposed way, where each sample is represented by tw