
    Transformations for multivariate statistics

    This paper derives transformations for multivariate statistics that eliminate asymptotic skewness, extending the results of Niki and Konishi (1986, Annals of the Institute of Statistical Mathematics 38, 371-383). Within the context of valid Edgeworth expansions for such statistics we first derive the set of equations that such a transformation must satisfy and second propose a local solution that is sufficient up to the desired order. Application of these results yields two useful corollaries. First, it is possible to eliminate the first correction term in an Edgeworth expansion, thereby accelerating convergence to the leading term normal approximation. Second, bootstrapping the transformed statistic can yield the same rate of convergence as the double, or prepivoted, bootstrap of Beran (1988, Journal of the American Statistical Association 83, 687-697), applied to the original statistic, implying a significant computational saving. The analytic results are illustrated by application to the family of exponential models, in which the transformation is seen to depend only upon the properties of the likelihood. The numerical properties are examined within a class of nonlinear regression models (logit, probit, Poisson, and exponential regressions), where the adequacy of the limiting normal and of the bootstrap (utilizing the k-step procedure of Andrews, 2002, Econometrica 70, 119-162) as distributional approximations is assessed.

    Multivariate Statistics

    This bachelor thesis provides an introduction to multivariate statistics, which is the analysis of data with multiple variables using statistical methods. The thesis focuses on the generalization of the normal distribution to random vectors, properties of the multivariate normal distribution, and non-parametric kernel estimation methods for estimating densities. Additionally, the thesis presents methods for separating populations and classifying new observations within these populations using multivariate statistics. The application of these methods is demonstrated using a real data set, and the accuracy of the classification is evaluated.

    Application of Random Matrix Theory to Multivariate Statistics

    This is an expository account of the edge eigenvalue distributions in random matrix theory and their application in multivariate statistics. The emphasis is on the Painlevé representations of these distributions.

    PSYX 522.01: Multivariate Statistics


    Topics In Multivariate Statistics

    Multivariate statistics concerns the study of dependence relations among multiple variables of interest. Distinct from widely studied regression problems where one of the variables is singled out as a response, in multivariate analysis all variables are treated symmetrically and the dependency structures are examined, either for interest in their own right or for further analyses such as regressions. This thesis includes the study of three independent research problems in multivariate statistics. The first part of the thesis studies additive principal components (APCs for short), a nonlinear method useful for exploring additive relationships among a set of variables. We propose a shrinkage regularization approach for estimating APC transformations by casting the problem in the framework of reproducing kernel Hilbert spaces. To formulate the kernel APC problem, we introduce the Null Comparison Principle, a principle that ties the constraint in a multivariate problem to its criterion in a way that makes the goal of the multivariate method under study transparent. In addition to providing a detailed formulation and exposition of the kernel APC problem, we study the asymptotic theory of kernel APCs. Our theory also motivates an iterative algorithm for computing kernel APCs. The second part of the thesis investigates the estimation of precision matrices in high dimensions when the data is corrupted in a cellwise manner and the uncontaminated data follows a multivariate normal distribution. It is known that in the setting of Gaussian graphical models, the conditional independence relations among variables are captured by the precision matrix of a multivariate normal distribution, and estimating the support of the precision matrix is equivalent to graphical model selection. In this work, we analyze the theoretical properties of robust estimators for precision matrices in high dimensions. The estimators we analyze are formed by plugging appropriately chosen robust covariance matrix estimators into the graphical Lasso and CLIME, two existing methods for high-dimensional precision matrix estimation. We establish error bounds for the precision matrix estimators that reveal the interplay between the dimensionality of the problem and the degree of contamination permitted in the observed distribution, and also analyze the breakdown point of both estimators. We also discuss implications of our work for Gaussian graphical model estimation in the presence of cellwise contamination. The third part of the thesis studies the problem of optimal estimation of a quadratic functional under the Gaussian two-sequence model. Quadratic functional estimation has been well studied under the Gaussian sequence model, and close connections between the problem of quadratic functional estimation and that of signal detection have been noted. Focusing on the estimation problem in the Gaussian two-sequence model, in this work we propose optimal estimators of the quadratic functional for different regimes and establish the minimax rates of convergence over a family of parameter spaces. The optimal rates exhibit an interesting phase transition in this family. We also discuss the implications of our estimation results on the associated simultaneous signal detection problem.

    An Integrated Approach to Determining the Risk of Overexploitation for Data-Poor Pelagic Atlantic Sharks

    Assesses the risk of overexploitation of sharks in Atlantic longline fisheries using three metrics based on multivariate statistics as a way to make management recommendations for species with limited data. Identifies species at higher risk.

    Applied Multivariate Statistics with R
