
    A General Framework for Consistency of Principal Component Analysis


    Contributions to Functional Data Analysis With a Focus on Points of Impact in Functional Regression

    The focus of this thesis is the estimation of an unknown number of unknown locations (points of impact) at which a functional predictor has a specific effect on a real-valued dependent variable. Two general setups are considered: Chapter 1 is concerned with estimating points of impact in a functional linear regression model, and Chapter 2 with estimating points of impact in a generalized functional linear regression framework. Besides consistency results for the introduced estimators of the points of impact and their number, both chapters contain results for the estimation of the remaining model parameters. The last chapter is concerned with the analysis of juggling cycles: a registration procedure is followed by a functional principal component analysis and an analysis of the corresponding eigenfunctions and scores.
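    The abstract does not spell out the estimator. A common strategy in the points-of-impact literature is to correlate the response with local fluctuations of the predictor and keep the locations where that covariance peaks; the sketch below illustrates that idea only and is not the thesis's estimator, with the grid, half-width delta and threshold all being assumptions.

```python
import numpy as np

def estimate_points_of_impact(X, Y, t, delta, threshold):
    """Toy points-of-impact search (illustration only).

    X: (n, p) array of curves on the common grid t (length p)
    Y: (n,) real-valued responses
    delta: half-width of the local difference, in grid points
    threshold: minimum |covariance| for a candidate location
    """
    n, p = X.shape
    Yc = Y - Y.mean()
    inner = np.arange(delta, p - delta)
    # Local fluctuation Z(t) = X(t) - 0.5 * (X(t - delta) + X(t + delta)),
    # which isolates variation of X in a neighbourhood of t.
    Z = X[:, inner] - 0.5 * (X[:, inner - delta] + X[:, inner + delta])
    Z -= Z.mean(axis=0)
    # Empirical covariance of the response with the local fluctuation
    cov = np.abs(Z.T @ Yc) / n
    # Candidate points of impact: local maxima of |cov| above the threshold
    is_peak = (
        (cov[1:-1] > cov[:-2]) & (cov[1:-1] > cov[2:]) & (cov[1:-1] > threshold)
    )
    return t[inner][1:-1][is_peak]
```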

    Sparse PCA Asymptotics and Analysis of Tree Data

    This research covers two major areas. The first is the asymptotic properties of Principal Component Analysis (PCA) and sparse PCA. The second is the application of functional data analysis to tree-structured data objects. A general asymptotic framework is developed for studying consistency properties of PCA. Assuming the spike population model, the framework considers increasing sample size, increasing dimension (or number of variables) and increasing spike sizes (the relative size of the population eigenvalues). Our framework includes several previously studied domains of asymptotics as special cases, and for the first time allows one to investigate interesting connections and transitions among the various domains. This unification provides new theoretical insights. Sparse PCA methods are efficient tools to reduce the dimension (or number of variables) of complex data. Sparse principal components (PCs) can be easier to interpret than conventional PCs, because most loadings are zero. We study the asymptotic properties of these sparse PC directions for scenarios with fixed sample size and increasing dimension (i.e. High Dimension, Low Sample Size (HDLSS)). We find a large set of sparsity assumptions under which sparse PCA is still consistent even when conventional PCA is strongly inconsistent. The consistency of sparse PCA is characterized along with rates of convergence. The boundaries of the consistent region are clarified using an oracle result. Functional data analysis has been very successful in the analysis of data lying in standard Euclidean space, such as curve data. However, with recent developments in fields such as medical image analysis, more and more non-Euclidean spaces, such as tree-structured data, present great challenges to statistical analysis. Here, we use the Dyck path approach from probability theory to build a bridge between tree space and curve space, exploiting the power of functional data analysis to analyze data in tree space.
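    The Dyck path (contour) construction mentioned at the end maps a rooted tree to a curve by recording the depth visited along a depth-first walk, so that curve-based FDA tools become applicable. The sketch below is a generic illustration of that encoding, not the dissertation's specific pipeline; the children-list representation of the tree is an assumption.

```python
def dyck_path(children, root=0):
    """Contour (Dyck) path of a rooted tree.

    children: dict mapping each node to the list of its children.
    Returns the sequence of depths visited by a depth-first walk that
    steps down one edge at a time and back up after each subtree, so
    the tree becomes a piecewise-linear curve.
    """
    path = [0]

    def walk(node, depth):
        for child in children.get(node, []):
            path.append(depth + 1)   # step down into the child
            walk(child, depth + 1)
            path.append(depth)       # step back up to the parent

    walk(root, 0)
    return path

# Example: root 0 with children 1 and 2; node 1 has child 3.
print(dyck_path({0: [1, 2], 1: [3]}))  # [0, 1, 2, 1, 0, 1, 0]
```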

    Functional Principal Component Analysis for Discretely Observed Functional Data and Sparse Fisher's Discriminant Analysis with Thresholded Linear Constraints

    We propose a new method to perform functional principal component analysis (FPCA) for discretely observed functional data by solving successive optimization problems. The new framework can be applied to both regularly and irregularly observed data, and to both dense and sparse data. Our method does not require estimates of the individual sample functions or of the covariance function. Hence, it can be used to analyze functional data with multidimensional arguments (e.g. random surfaces). Furthermore, it can be applied to many processes and models with complicated or nonsmooth covariance functions. In our method, the smoothness of the eigenfunctions is controlled by directly imposing roughness penalties on them, which makes it more efficient and flexible to tune the smoothness. Efficient algorithms for solving the successive optimization problems are proposed. We establish the existence and characterization of the solutions to the successive optimization problems, and prove the consistency of our method. Through simulations, we demonstrate that our method performs well with smooth sample curves, with discontinuous sample curves and nonsmooth covariance, and with sample functions having two-dimensional arguments (random surfaces), respectively. We apply our method to classification problems of retinal pigment epithelial cells in eyes of mice and to longitudinal CD4 count data.

    In the second part of this dissertation, we propose a sparse Fisher's discriminant analysis method with thresholded linear constraints. Various regularized linear discriminant analysis (LDA) methods have been proposed to address the problems of LDA in high-dimensional settings. Asymptotic optimality has been established for some of these methods when there are only two classes. One difficulty in the asymptotic study of multiclass classification is that in the two-class case the classification boundary is a hyperplane and an explicit formula for the classification error exists, whereas in the multiclass case the boundary is usually complicated and no explicit formula for the error generally exists. Another difficulty in proving the asymptotic consistency and optimality of sparse Fisher's discriminant analysis is that the covariance matrix enters the constraints of the optimization problems for higher-order components, and a general high-dimensional covariance matrix is not easy to estimate. We therefore propose a sparse Fisher's discriminant analysis method that avoids estimating the covariance matrix, and we provide asymptotic consistency results and the corresponding convergence rates for all components. To prove asymptotic optimality, we derive an asymptotic upper bound for a general linear classification rule in the multiclass case, which is applied to our method to obtain asymptotic optimality and the corresponding convergence rate. In the special case of two classes, our method achieves convergence rates equal to or better than those of the existing method. The proposed method is applied to multivariate functional data with wavelet transformations.
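    The abstract describes roughness penalties imposed directly on the eigenfunctions. As a simplified illustration of that idea only (not the paper's successive-optimization algorithm, which also handles sparse and irregular designs), the sketch below extracts roughness-penalized eigenfunctions from densely observed curves on a common grid by solving a generalized eigenproblem.

```python
import numpy as np
from scipy.linalg import eigh

def penalized_fpca(X, lam, k=3):
    """Roughness-penalized FPCA on a common grid (simplified sketch).

    X: (n, p) curves observed on p equally spaced grid points
    lam: weight of the roughness penalty on second differences
    k: number of eigenfunctions to return
    """
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    S = Xc.T @ Xc / n                      # empirical covariance on the grid
    D2 = np.diff(np.eye(p), n=2, axis=0)   # second-difference operator
    P = D2.T @ D2                          # roughness penalty matrix
    # Smoothed eigenproblem: S v = rho * (I + lam * P) v
    vals, vecs = eigh(S, np.eye(p) + lam * P)
    order = np.argsort(vals)[::-1][:k]
    return vals[order], vecs[:, order]
```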

    Feature Selection for Functional Data

    In this paper we address the problem of feature selection when the data are functional; we study several statistical procedures, including classification, regression and principal components. One advantage of the blinding procedure is that it is very flexible, since the features are defined by a set of functions, relevant to the problem being studied, proposed by the user. Our method is consistent under a set of quite general assumptions, and produces good results with the real data examples that we analyze.
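    The abstract only hints at how the user-proposed functions become features. As a purely illustrative assumption (not the paper's blinding procedure), one simple scheme is to turn each user-supplied function into a scalar feature via an inner product with the curve and then rank features by their association with the response:

```python
import numpy as np

def score_functional_features(X, y, t, feature_fns):
    """Rank user-defined functional features by |correlation| with y.

    X: (n, p) curves on the grid t; y: (n,) responses
    feature_fns: list of callables phi_j(t); feature j of curve i is
    the inner product <X_i, phi_j>, approximated by the trapezoid rule.
    """
    feats = np.column_stack(
        [np.trapz(X * fn(t), t, axis=1) for fn in feature_fns]
    )
    scores = [abs(np.corrcoef(f, y)[0, 1]) for f in feats.T]
    return np.argsort(scores)[::-1]  # feature indices, best first
```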

    Intrinsic Inference on the Mean Geodesic of Planar Shapes and Tree Discrimination by Leaf Growth

    For planar landmark-based shapes, taking into account the non-Euclidean geometry of the shape space, a statistical test for a common mean first geodesic principal component (GPC) is devised. It rests on one of two asymptotic scenarios, both of which are identical in a Euclidean geometry. For both scenarios, strong consistency and central limit theorems are established, along with an algorithm for the computation of a Ziezold mean geodesic. In application, this allows one to verify the geodesic hypothesis for leaf growth of Canadian black poplars and to discriminate genetically different trees from observations of leaf shape growth over brief time intervals. With a test based on Procrustes tangent space coordinates, which does not involve the shape space's curvature, neither can be achieved.
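    The Ziezold mean geodesic algorithm itself is beyond a short sketch. As background only, and under the assumption of complete (k, 2) landmark configurations, the snippet below shows the standard planar preshape representation (centering, unit scaling, complex coordinates) and a simple align-average-project iteration for a rotation-aligned mean shape; this is a generic Procrustes-type illustration, not the paper's method.

```python
import numpy as np

def preshape(landmarks):
    """Center and unit-scale a (k, 2) landmark configuration,
    returned as a complex k-vector (planar preshape)."""
    z = landmarks[:, 0] + 1j * landmarks[:, 1]
    z = z - z.mean()
    return z / np.linalg.norm(z)

def rotation_aligned_mean(configs, n_iter=50):
    """Align-average-project iteration for a mean planar shape.

    Each preshape is rotated to best match the current mean, the
    rotated preshapes are averaged, and the average is projected
    back to unit norm.
    """
    zs = np.array([preshape(c) for c in configs])
    mu = zs[0]
    for _ in range(n_iter):
        w = np.conj(zs) @ mu                     # <z_i, mu> for each shape
        aligned = (w / np.abs(w))[:, None] * zs  # optimal rotation of each z_i
        mu = aligned.mean(axis=0)
        mu = mu / np.linalg.norm(mu)
    return mu
```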

    Consistency of Skinner Box Activity Measures in the Domestic Rabbit (Oryctolagus cuniculus)

    Consistency of individual differences in several measures of Skinner box operant and other activity, and their intercorrelations, was studied in 14 chinchilla-bred rabbits. Reliability analysis revealed that both operant and activity measures were highly consistent (Cronbach's alpha > 0.87) over at least 15 days. Furthermore, locomotor activity, the tendencies to press the lever at high frequency, to make many errors, and to check for the presence of food in the dispenser, as well as rearing, were highly intercorrelated, making up a single dimension of activity. However, grooming did not correlate with these behaviors.
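    Cronbach's alpha, the reliability coefficient quoted above, measures how consistently a set of repeated measures covary across subjects: alpha = k/(k-1) * (1 - sum of item variances / variance of the total score). A minimal computation, assuming a subjects-by-items score matrix (e.g. rabbits by daily activity scores), is:

```python
import numpy as np

def cronbach_alpha(scores):
    """Cronbach's alpha for a (subjects, items) score matrix.

    alpha = k/(k-1) * (1 - sum of item variances / variance of total score)
    """
    k = scores.shape[1]
    item_vars = scores.var(axis=0, ddof=1)      # variance of each item/measure
    total_var = scores.sum(axis=1).var(ddof=1)  # variance of per-subject totals
    return k / (k - 1) * (1 - item_vars.sum() / total_var)
```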