
    Fast Convergence on Perfect Classification for Functional Data

    In this study, we investigate whether perfect classification on functional data can be approached with finite samples. The seminal work of Delaigle and Hall (2012) showed that a perfect classifier is easier to define for functional data than for finite-dimensional data. This result rests on their finding that a sufficient condition for the existence of a perfect classifier, named the Delaigle--Hall (DH) condition, is available only for functional data. However, there is a danger that a large sample size is required to achieve perfect classification even when the DH condition holds, because the misclassification error for functional data converges very slowly. Specifically, the minimax rate of convergence of the error for functional data is of logarithmic order in the sample size. This study resolves this difficulty by proving that the DH condition also yields fast convergence of the misclassification error in the sample size. To this end, we study a classifier based on empirical risk minimization over a reproducing kernel Hilbert space (RKHS) and analyse its convergence rate under the DH condition. The result shows that the misclassification error of the RKHS classifier converges at an exponential order in the sample size. Technically, the proof rests on two points: (i) connecting the DH condition to a margin of classifiers, and (ii) handling the metric entropy of functional data. Experimentally, we validate that the DH condition and the associated margin condition affect the convergence rate of the RKHS classifier. We also find that some other classifiers for functional data have a similar property. Comment: 26 pages
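As a rough illustration of what an RKHS classifier obtained by empirical risk minimization can look like on discretized curves, the sketch below fits a kernel classifier with a squared loss and a Gaussian kernel. The loss, the kernel, the bandwidth, the regularization value and the function names (`gaussian_gram`, `fit_rkhs_classifier`, `predict_rkhs`) are all assumptions made for illustration; the abstract does not specify them, and this is not presented as the authors' estimator.

```python
import numpy as np

def gaussian_gram(A, B, bandwidth):
    """Gaussian kernel between curves sampled on a common grid.

    The squared L2 distance between two curves is approximated by the
    mean squared difference over the grid points.
    """
    sq_dist = ((A[:, None, :] - B[None, :, :]) ** 2).mean(axis=2)
    return np.exp(-sq_dist / (2.0 * bandwidth ** 2))

def fit_rkhs_classifier(curves, labels_pm1, bandwidth=1.0, reg=1e-3):
    """Minimize (1/n) * sum (y_i - f(x_i))^2 + reg * ||f||_H^2 over the RKHS.

    By the representer theorem f = sum_i alpha_i k(x_i, .), and the
    minimizer solves (K + n * reg * I) alpha = y. Labels are +1 / -1.
    """
    n = curves.shape[0]
    K = gaussian_gram(curves, curves, bandwidth)
    return np.linalg.solve(K + n * reg * np.eye(n), labels_pm1)

def predict_rkhs(alpha, train_curves, new_curves, bandwidth=1.0):
    """Classify new curves by the sign of the fitted RKHS function."""
    return np.sign(gaussian_gram(new_curves, train_curves, bandwidth) @ alpha)
```

Here each row of `curves` is one function evaluated on a shared grid, so the functional inner product is replaced by an average over grid points. A margin-based loss such as the hinge or logistic loss could be substituted for the squared loss without changing the overall structure.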

    Simultaneous registration and modelling for multi-dimensional functional data

    PhD Thesis. Functional data analysis (FDA) has many applications in almost every branch of science, such as engineering, medicine and biology. It aims to cope with the analysis of data in the form of images, curves and shapes. In this thesis, we study the 2D trajectories of hyoid bone movement from X-ray images. These curves are seen as observations of multi-dimensional functional data. We first develop an all-in-one platform for data acquisition and preprocessing. However, analyzing the data raises many challenges, and in this thesis we provide solutions to some of them. We propose a new registration method for handling the raw 2D curves, which integrates generalized Procrustes analysis and the self-modelling registration method (GPSM). However, the application reveals that performing registration and classification as separate, sequential steps does not work well. Therefore, we propose two-stage functional models for joint curve registration and classification (JCRC). In the first stage, we use a functional logistic regression model where the aligned curves are estimated from the second stage. The latter uses a nonlinear warping function while modelling the 2D curves, i.e. resolving the misalignment problem and the modelling problem simultaneously. This two-stage model takes into account both the scalar variables and the multi-dimensional functional data. For functional data clustering, we propose mixtures of Gaussian process functional regression with time warping and a logistic allocation model, allowing the use of both types of variables and also allowing simultaneous registration and clustering (SRC). A two-level model is introduced. For the data collected from subjects in different groups, a Gaussian process functional regression model is used as the first-level model; an allocation model depending on scalar variables is used as the second-level model, providing further information about the groups. These three methods, i.e., GPSM, JCRC and SRC, are all examined on both simulated data and real data.

    Wavelet-RKHS-based functional statistical classification

    A functional classification methodology, based on Reproducing Kernel Hilbert Space (RKHS) theory, is proposed for the discrimination of gene expression profiles. The parameter function involved in the definition of the functional logistic regression is univocally and consistently estimated by minimizing the penalized negative log-likelihood over an RKHS generated by a suitable wavelet basis. An iterative descent method, the gradient method, is applied to solve the corresponding minimization problem, i.e., to compute the functional estimate. Temporal gene expression data from the yeast cell cycle are classified with the wavelet-RKHS-based discrimination methodology considered. A simulation study is carried out to test the performance of this statistical classification methodology in comparison with other statistical discrimination procedures.
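The estimation step described in this abstract, minimizing a penalized negative log-likelihood by a gradient method over coefficients in a wavelet basis, can be sketched as follows. This is only a minimal illustration under stated assumptions: a Haar basis stands in for the unspecified "suitable wavelet basis", a plain ridge penalty on the coefficients stands in for the RKHS norm, and the step size, iteration count and function names (`haar_basis`, `fit_functional_logistic`, `predict_proba`) are hypothetical.

```python
import numpy as np

def haar_basis(n_points, levels):
    """Haar scaling function and wavelets sampled at grid midpoints of [0, 1).

    Returns an array of shape (2 ** levels, n_points).
    """
    t = (np.arange(n_points) + 0.5) / n_points
    basis = [np.ones(n_points)]  # scaling function
    for j in range(levels):
        for k in range(2 ** j):
            lo, mid, hi = k / 2 ** j, (k + 0.5) / 2 ** j, (k + 1) / 2 ** j
            up = ((t >= lo) & (t < mid)).astype(float)
            down = ((t >= mid) & (t < hi)).astype(float)
            basis.append(2 ** (j / 2) * (up - down))
    return np.array(basis)

def fit_functional_logistic(curves, labels, levels=3, penalty=1e-2,
                            step=0.5, n_iter=2000):
    """Gradient descent on mean negative log-likelihood + penalty * ||coef||^2.

    `curves` has shape (n, n_points); `labels` holds 0/1 values.
    """
    n, n_points = curves.shape
    B = haar_basis(n_points, levels)
    Z = curves @ B.T / n_points  # Riemann-sum inner products <x_i, psi_k>
    coef = np.zeros(Z.shape[1])
    intercept = 0.0
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(intercept + Z @ coef)))
        grad_coef = Z.T @ (p - labels) / n + 2.0 * penalty * coef
        grad_intercept = np.mean(p - labels)
        coef -= step * grad_coef
        intercept -= step * grad_intercept
    return intercept, coef, B

def predict_proba(intercept, coef, B, curves):
    """Estimated probability of class 1 for each curve."""
    Z = curves @ B.T / curves.shape[1]
    return 1.0 / (1.0 + np.exp(-(intercept + Z @ coef)))
```

The estimated parameter function on the grid is recovered as `coef @ B`. A penalty that weights coefficients by resolution level would be closer in spirit to an RKHS norm than the unweighted ridge term used here.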