Fast Convergence on Perfect Classification for Functional Data
In this study, we investigate whether perfect classification of functional
data can be approached with finite samples. The seminal work of Delaigle and
Hall (2012) showed that a perfect classifier is easier to define for
functional data than for finite-dimensional data. This result is based on
their finding that a sufficient condition for the existence of a perfect
classifier, named the Delaigle-Hall (DH) condition, is only available for
functional data. However, there is a danger that a large sample size is
required to achieve perfect classification even when the DH condition holds,
because the misclassification error for functional data converges very
slowly. Specifically, the minimax rate of convergence of the error for
functional data is of logarithmic order in the sample size. This study
resolves this difficulty by proving that the DH condition also yields fast
convergence of the misclassification error in the sample size. To this end,
we study a classifier based on empirical risk minimization over a reproducing
kernel Hilbert space (RKHS) and analyse its convergence rate under the DH
condition.
The result shows that the convergence speed of the misclassification error by
the RKHS classifier has an exponential order in sample size. Technically, the
proof is based on the following points: (i) connecting the DH condition and a
margin of classifiers, and (ii) handling metric entropy of functional data.
Experimentally, we validate that the DH condition and the associated margin
condition have a certain impact on the convergence rate of the RKHS classifier.
We also find that some of the other classifiers for functional data have a
similar property.
Comment: 26 pages
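As a rough illustration of the kind of RKHS-based empirical risk minimization this abstract refers to, the following sketch fits a kernel classifier to discretized curves. The squared loss, the Gaussian kernel on the L2 distance, the simulated curves, and all function names and parameter values are assumptions made for illustration; none of them are taken from the paper.

```python
import numpy as np

def gaussian_kernel(A, B, dt, bandwidth):
    """Gaussian kernel on the Riemann-sum squared L2 distance between curves."""
    # A: (n, T), B: (m, T); curves sampled on a common grid with spacing dt.
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    d2 = dt * np.maximum(sq, 0.0)
    return np.exp(-d2 / (2.0 * bandwidth**2))

def fit_rkhs_classifier(X, y, dt, bandwidth=0.5, lam=1e-3):
    """Regularized ERM with squared loss over the RKHS (kernel ridge); y in {-1, +1}."""
    n = X.shape[0]
    K = gaussian_kernel(X, X, dt, bandwidth)
    return np.linalg.solve(K + n * lam * np.eye(n), y.astype(float))

def predict(alpha, X_train, X_new, dt, bandwidth=0.5):
    """Classify new curves by the sign of the fitted RKHS function."""
    scores = gaussian_kernel(X_new, X_train, dt, bandwidth) @ alpha
    return np.where(scores >= 0, 1, -1)

def simulate_curves(n_per_class, T=64, noise=0.3, seed=0):
    """Hypothetical toy data: noisy zero curves (-1) versus noisy sine curves (+1)."""
    rng = np.random.default_rng(seed)
    t = (np.arange(T) + 0.5) / T
    X_neg = noise * rng.standard_normal((n_per_class, T))
    X_pos = np.sin(2 * np.pi * t) + noise * rng.standard_normal((n_per_class, T))
    X = np.vstack([X_neg, X_pos])
    y = np.concatenate([-np.ones(n_per_class), np.ones(n_per_class)])
    return X, y, 1.0 / T

X_train, y_train, dt = simulate_curves(50, seed=0)
X_test, y_test, _ = simulate_curves(50, seed=1)
alpha = fit_rkhs_classifier(X_train, y_train, dt)
accuracy = (predict(alpha, X_train, X_test, dt) == y_test).mean()
print(f"test accuracy: {accuracy:.2f}")
```

Repeating the fit for increasing `n_per_class` and recording the test error gives an informal view of how quickly the error falls with sample size on such toy data.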
Simultaneous registration and modelling for multi-dimensional functional data
PhD Thesis
Functional data analysis (FDA) has many applications in almost every branch of science,
such as engineering, medicine and biology. It aims to cope with the analysis of data in
the form of images, curves and shapes. In this thesis, we study the 2D trajectories of
hyoid bone movement from X-ray images. Those curves are seen as observations of
multi-dimensional functional data. We first develop an all-in-one platform for data
acquisition and preprocessing. However, analyzing the data raises many challenges. In
this thesis, we provide solutions to some of those challenging problems.
We propose a new registration method for handling those raw 2D curves. It basically
integrates Generalized Procrustes analysis and the self-modelling registration method (GPSM).
However, the application reveals that classification followed by registration does not
work well. Therefore, we propose two-stage functional models for joint curve registration
and classification (JCRC). In the first stage, we use a functional logistic regression model
where the aligned curves are estimated from the second stage. The latter uses a nonlinear
warping function while modelling the 2D curves, i.e. resolving the misalignment problem
and modelling problem simultaneously. This two-stage model takes into account both the
scalar variables and the multi-dimensional functional data. For the functional data clustering,
we propose mixtures of Gaussian process functional regression with time warping
and logistic allocation model, allowing the use of both types of variables and also allowing
simultaneous registration and clustering (SRC). A two-level model is introduced. For the
data collected from subjects in different groups, a Gaussian process functional regression
model is used as the first level model; an allocation model depending on scalar variables
is used as the second level model providing further information over the groups. Those
three methods, i.e., GPSM, JCRC and SRC are all examined on both simulated data and
real data.
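The Procrustes ingredient of the registration method mentioned in this abstract can be sketched as follows. This is only ordinary Procrustes alignment of one 2D curve to a reference (translation, scale and rotation via an SVD), not the GPSM method itself, which also involves self-modelling registration. The function name and the toy trajectory are hypothetical.

```python
import numpy as np

def procrustes_align(ref, curve):
    """Align a 2D curve (T, 2) to a reference (T, 2) by translation, scale and rotation."""
    ref_mean = ref.mean(axis=0)
    ref_c = ref - ref_mean                 # centre both curves
    cur_c = curve - curve.mean(axis=0)
    # The SVD of the cross-covariance gives the least-squares optimal rotation.
    U, s, Vt = np.linalg.svd(cur_c.T @ ref_c)
    if np.linalg.det(U @ Vt) < 0:          # exclude reflections
        U[:, -1] *= -1
        s[-1] *= -1
    R = U @ Vt
    scale = s.sum() / (cur_c**2).sum()     # least-squares optimal scale
    return scale * cur_c @ R + ref_mean

# Toy trajectory (assumed shape): half an ellipse, then rotated, scaled and shifted.
t = np.linspace(0.0, np.pi, 50)
ref = np.column_stack([np.cos(t), 0.5 * np.sin(t)])
theta = 0.7
rot = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta),  np.cos(theta)]])
moved = 2.0 * ref @ rot + np.array([3.0, -1.0])
aligned = procrustes_align(ref, moved)
print(np.abs(aligned - ref).max())
```

Generalized Procrustes analysis repeats this pairwise step over all curves, realigning each to the current mean shape until the mean stabilizes. It removes only rigid and scale differences, so any time warping along the curve must be handled separately.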
Wavelet-RKHS-based functional statistical classification
A functional classification methodology, based on Reproducing Kernel Hilbert Space (RKHS) theory, is proposed for the discrimination of gene expression profiles. The parameter function in the functional logistic regression model is uniquely and consistently estimated by minimizing the penalized negative log-likelihood over an RKHS generated by a suitable wavelet basis. An iterative descent method, the gradient method, is applied to solve the corresponding minimization problem, i.e., to compute the functional estimate. Temporal gene expression data from the yeast cell cycle are classified with the proposed wavelet-RKHS-based discrimination methodology. A simulation study is carried out to test the performance of this statistical classification methodology in comparison with other statistical discrimination procedures.
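The estimation step described in this abstract can be sketched as below: curves are projected onto a wavelet basis and the coefficients of a logistic model are found by gradient descent on a penalized negative log-likelihood. The Haar basis, the ridge penalty on the coefficients (standing in for the RKHS norm penalty), the fixed step size and all names are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def haar_basis(T):
    """Orthonormal Haar basis on T = 2**J grid points, one basis function per row."""
    if T < 1 or T & (T - 1):
        raise ValueError("T must be a power of two")
    H = np.array([[1.0]])
    while H.shape[0] < T:
        k = H.shape[0]
        H = np.vstack([np.kron(H, np.array([[1.0, 1.0]])),          # coarser levels
                       np.kron(np.eye(k), np.array([[1.0, -1.0]]))])  # new finest level
    return H / np.linalg.norm(H, axis=1, keepdims=True)

def penalized_nll(theta, Z, y, lam):
    """Mean logistic negative log-likelihood plus a ridge penalty (intercept unpenalized)."""
    eta = theta[0] + Z @ theta[1:]
    return np.mean(np.logaddexp(0.0, eta) - y * eta) + lam * np.sum(theta[1:]**2)

def fit_functional_logistic(X, y, n_coef=8, lam=1e-3, step=0.1, n_iter=500):
    """Gradient descent on the penalized NLL; X is (n, T) curves, y is in {0, 1}."""
    H = haar_basis(X.shape[1])[:n_coef]    # keep the coarsest n_coef wavelets
    Z = X @ H.T                            # wavelet coefficients of each curve
    theta = np.zeros(n_coef + 1)
    losses = []
    for _ in range(n_iter):
        eta = theta[0] + Z @ theta[1:]
        p = 0.5 * (1.0 + np.tanh(0.5 * eta))   # numerically stable sigmoid
        grad = np.concatenate([[np.mean(p - y)],
                               Z.T @ (p - y) / len(y) + 2.0 * lam * theta[1:]])
        theta -= step * grad
        losses.append(penalized_nll(theta, Z, y, lam))
    return theta, H, losses
```

Given a fit `theta, H, losses = fit_functional_logistic(X, y)`, the estimated parameter function on the grid is `theta[1:] @ H`, and a curve `x` is assigned to class 1 when `theta[0] + x @ H.T @ theta[1:] > 0`.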