Kernel Multivariate Analysis Framework for Supervised Subspace Learning: A Tutorial on Linear and Kernel Multivariate Methods
Feature extraction and dimensionality reduction are important tasks in many
fields of science dealing with signal processing and analysis. The relevance of
these techniques is increasing as current sensory devices are developed with
ever higher resolution, and problems involving multimodal data sources become
more common. A plethora of feature extraction methods are available in the
literature collectively grouped under the field of Multivariate Analysis (MVA).
This paper provides a uniform treatment of several methods: Principal Component
Analysis (PCA), Partial Least Squares (PLS), Canonical Correlation Analysis
(CCA) and Orthonormalized PLS (OPLS), as well as their non-linear extensions
derived by means of the theory of reproducing kernel Hilbert spaces. We also
review their connections to other methods for classification and statistical
dependence estimation, and introduce some recent developments to deal with the
extreme cases of large-scale and low-sized problems. To illustrate the wide
applicability of these methods in both classification and regression problems,
we analyze their performance in a benchmark of publicly available data sets,
and pay special attention to specific real applications involving audio
processing for music genre prediction and hyperspectral satellite images for
Earth and climate monitoring.
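As a minimal illustration of the simplest method named in the abstract above (PCA), the following sketch estimates the first principal direction of a small data set by power iteration on the sample covariance matrix. The function names and the toy data are illustrative assumptions and do not come from the paper. The supervised methods it covers (PLS, CCA, OPLS), which also use the target variables, are not shown.

```python
import math

def center(X):
    """Subtract the per-feature mean from every row of X (a list of lists)."""
    n, d = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(d)]
    return [[row[j] - means[j] for j in range(d)] for row in X]

def covariance(Xc):
    """Sample covariance matrix of already-centered data."""
    n, d = len(Xc), len(Xc[0])
    return [[sum(r[i] * r[j] for r in Xc) / (n - 1) for j in range(d)]
            for i in range(d)]

def first_principal_component(X, iters=200):
    """Return (unit direction, variance along it) via power iteration.

    Assumes the uniform starting vector is not orthogonal to the
    dominant eigenvector of the covariance matrix.
    """
    C = covariance(center(X))
    d = len(C)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(C[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # degenerate case: no variance in the data
        v = [x / norm for x in w]
    # Rayleigh quotient v^T C v gives the variance captured by v.
    variance = sum(v[i] * sum(C[i][j] * v[j] for j in range(d))
                   for i in range(d))
    return v, variance

# Toy data lying exactly on the line y = x / 2 (an assumption for the demo).
toy = [[0.0, 0.0], [2.0, 1.0], [4.0, 2.0], [6.0, 3.0]]
direction, var = first_principal_component(toy)
```

For the toy data, `direction` points along the line on which the samples lie, and `var` is the total variance of the data, since all of it falls along that line.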
Comparison of academic performance of twins and singletons in adolescence : follow-up study
Objectives To determine whether twins in recent
cohorts show similar academic performance in
adolescence to singletons and to test the effect of
birth weight on academic performance in twins and
singletons.
Design Follow-up study.
Setting Denmark.
Participants All twins (n = 3411) and a 5% random
sample of singletons (n = 7796) born in Denmark
during 1986-8.
Main outcome measures Test scores in ninth grade
(age 15 or 16), birth weight, gestational age at birth,
parents' age, and parents' education.
Results Ninth grade test scores were normally
distributed, with almost identical mean and standard
deviations for twins and singletons (8.02 v 8.02 and
1.05 v 1.06) despite the twins weighing on average
908 g (95% confidence interval 886 to 930 g) less
than the singletons at birth. Controlling for birth
weight, gestational age at birth, age at test, and
parents' age and education confirmed the similarity of
test scores for twins and singletons (difference 0.04,
95% confidence interval −0.03 to 0.10). A significant,
positive association between test score and birth
weight was observed in both twins and singletons, but
the size of the effect was small: 0.06-0.12 standard
deviations for every kilogram increase in birth weight.
Conclusions Although older cohorts of twins have
been found to have lower mean IQ scores than
singletons, twins in recent Danish cohorts show
similar academic performance in adolescence to that
of singletons. Birth weight has a minimal effect on
academic performance in recent cohorts; for twins
this effect is best judged relative to what is a normal
birth weight for twins and not for singletons.
Mel Frequency Cepstral Coefficients: An Evaluation of Robustness of MP3 Encoded Music
In large MP3 databases, files are typically generated with different parameter settings, i.e., bit rates and sampling rates. This is of concern for MIR applications, as encoding differences can potentially confound meta-data estimation and similarity evaluation. In this paper we discuss the influence of MP3 coding on the Mel frequency cepstral coefficients (MFCCs). The main result is that the widely used subset of the MFCCs is robust at bit rates equal to or higher than 128 kbits/s, for the implementations we have investigated. However, for lower bit rates, e.g., 64 kbits/s, the implementation of the Mel filter bank becomes an issue.
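To make concrete what a Mel filter bank implementation involves, here is a minimal sketch of one common mel-scale convention and the triangular filters built on it. The formula constants (2595, 700), the function names, and the parameter values are assumptions chosen for illustration. The abstract does not specify which implementations the paper investigated, and other conventions exist.

```python
import math

def hz_to_mel(f_hz):
    """Map frequency in Hz to mels (one common convention, assumed here)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_band_edges(n_filters, f_min_hz, f_max_hz):
    """Return n_filters + 2 frequencies (Hz) equally spaced on the mel scale.

    Filter k spans edges[k] .. edges[k + 2] and peaks at edges[k + 1].
    """
    m_lo, m_hi = hz_to_mel(f_min_hz), hz_to_mel(f_max_hz)
    step = (m_hi - m_lo) / (n_filters + 1)
    return [mel_to_hz(m_lo + i * step) for i in range(n_filters + 2)]

def triangular_weight(f_hz, left, center, right):
    """Weight of a triangular filter at f_hz: 0 at the edges, 1 at the center."""
    if f_hz <= left or f_hz >= right:
        return 0.0
    if f_hz <= center:
        return (f_hz - left) / (center - left)
    return (right - f_hz) / (right - center)

# Hypothetical setup: 10 filters covering 0 Hz to 8 kHz.
edges = mel_band_edges(10, 0.0, 8000.0)
```

Because the edges are equally spaced in mels rather than in Hz, the filters widen with frequency. Choices such as the frequency range, the number of filters, and the scale formula are exactly the kind of detail on which implementations can differ.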
Sparse kernel orthonormalized PLS for feature extraction in large datasets
In this paper we present a novel multivariate analysis method. Our scheme is based on a novel kernel orthonormalized partial least squares (PLS) variant for feature extraction, imposing sparsity constraints on the solution to improve scalability. The algorithm is tested on a benchmark of UCI data sets, and on the analysis of integrated short-time music features for genre prediction. The upshot is that the method has strong expressive power even with rather few features, clearly outperforms ordinary kernel PLS, and is therefore an appealing method for feature extraction from labelled data.
A Genre Classification Plug-in for Data Collection
[TODO] Add abstract here