    On Measure Transformed Canonical Correlation Analysis

    In this paper, linear canonical correlation analysis (LCCA) is generalized by applying a structured transform to the joint probability distribution of the considered pair of random vectors, i.e., a transformation of the joint probability measure defined on their joint observation space. This framework, called measure transformed canonical correlation analysis (MTCCA), applies LCCA to the data after transformation of the joint probability measure. We show that a judicious choice of the transform leads to a modified canonical correlation analysis which, in contrast to LCCA, is capable of detecting non-linear relationships between the considered pair of random vectors. Unlike kernel canonical correlation analysis, where the transformation is applied to the random vectors, in MTCCA the transformation is applied to their joint probability distribution. This results in performance advantages and reduced implementation complexity. The proposed approach is illustrated for graphical model selection in simulated data having non-linear dependencies, and for measuring long-term associations between companies traded in the NASDAQ and NYSE stock markets.
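To make the LCCA baseline in this abstract concrete, here is a minimal sketch of sample canonical correlations computed by whitening the cross-covariance and taking its singular values. It covers plain LCCA only, not the measure-transformed method of the paper. The function name `canonical_correlations` and the small ridge term `reg` are assumptions introduced for this illustration.

```python
import numpy as np

def canonical_correlations(X, Y, reg=1e-10):
    """Sample canonical correlations between X (n x p) and Y (n x q).

    Illustrative helper (hypothetical name); `reg` is a tiny ridge term,
    assumed here only to keep the Cholesky factorizations stable.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    n = X.shape[0]
    Xc = X - X.mean(axis=0)  # center each column
    Yc = Y - Y.mean(axis=0)
    Sxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / (n - 1)
    # Cholesky factors: Sxx = Lx Lx^T and Syy = Ly Ly^T
    Lx = np.linalg.cholesky(Sxx)
    Ly = np.linalg.cholesky(Syy)
    # Whitened cross-covariance K = Lx^{-1} Sxy Ly^{-T}
    K = np.linalg.solve(Lx, Sxy)
    K = np.linalg.solve(Ly, K.T).T
    # The singular values of K are the canonical correlations
    s = np.linalg.svd(K, compute_uv=False)
    return np.clip(s, 0.0, 1.0)

rng = np.random.default_rng(0)
x = rng.standard_normal((5000, 1))
# A linear relationship is picked up by LCCA
print(canonical_correlations(x, 2.0 * x + 0.1 * rng.standard_normal((5000, 1))))
# A purely quadratic relationship with symmetric x is nearly invisible to it
print(canonical_correlations(x, x ** 2))
```

The second call illustrates the limitation the abstract refers to: when x is symmetric about zero, x and x² are uncorrelated, so the linear canonical correlation is close to zero even though the dependence is deterministic.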

    Asymptotic theory of multiple-set linear canonical analysis

    This paper deals with asymptotics for multiple-set linear canonical analysis (MSLCA). A definition of this analysis that adapts the classical one to the context of Euclidean random variables is given, and properties of the related canonical coefficients are derived. Then, estimators of the elements of MSLCA, based on empirical covariance operators, are proposed and asymptotics for these estimators are obtained. More precisely, we prove their consistency and we obtain asymptotic normality for the estimator of the operator that gives MSLCA, and also for the estimator of the vector of canonical coefficients. These results are then used to obtain a test for mutual non-correlation between the involved Euclidean random variables.

    Testing Dependence Among Serially Correlated Multi-category Variables

    The contingency table literature on tests for dependence among discrete multi-category variables assumes that draws are independent, and there are no tests that account for serial dependencies, a problem that is particularly important in economics and finance. This paper proposes a new test of independence based on the maximum canonical correlation between pairs of discrete variables. We also propose a trace canonical correlation test using dynamically augmented reduced rank regressions or an iterated weighting method in order to account for serial dependence. Such tests are useful, for example, when testing for predictability of one sequence of discrete random variables by means of another sequence of discrete random variables, as in tests of market timing skills or business cycle analysis. The proposed tests allow for an arbitrary number of categories, are robust in the presence of serial dependencies and are simple to implement using multivariate regression methods.

    Asymptotic study of canonical correlation analysis: from matrix and analytic approach to operator and tensor approach

    The asymptotic study of canonical correlation analysis offers an opportunity to present the successive steps of an asymptotic study and to show the value of an operator and tensor approach to multidimensional asymptotic statistics over the classical matrix and analytic approach. Using the latter approach, Anderson (1999) assumes that the random vectors are normally distributed and that the nonzero canonical correlation coefficients are distinct. The new approach we use (Fine, 2000) is coordinate-free and distribution-free, and places no restriction on the multiplicity of the canonical correlation coefficients. When the vectors are normally distributed and the nonzero canonical correlation coefficients are distinct, Anderson's results can be recovered, although we diverge on two of them. In this methodological presentation, we emphasize the analysis framework (Dauxois and Pousse, 1976), the sampling model (Dauxois, Fine and Pousse, 1979) and the mathematical tools (Fine, 1987; Dauxois, Romain and Viguier, 1994) that make it possible to solve the problems encountered in this type of study, and even to obtain the asymptotic behavior of the random elements of the analyses, such as principal components and canonical variables.

    Testing Dependence among Serially Correlated Multi-category Variables

    The contingency table literature on tests for dependence among discrete multi-category variables is extensive. Existing tests assume, however, that draws are independent, and there are no tests that account for serial dependencies, a problem that is particularly important in economics and finance. This paper proposes a new test of independence based on the maximum canonical correlation between pairs of discrete variables. We also propose a trace canonical correlation test using dynamically augmented reduced rank regressions or an iterated weighting method in order to account for serial dependence. Such tests are useful, for example, when testing for predictability of one sequence of discrete random variables by means of another sequence of discrete random variables, as in tests of market timing skills or business cycle analysis. The proposed tests allow for an arbitrary number of categories, are robust in the presence of serial dependencies and are simple to implement using multivariate regression methods. Monte Carlo experiments show that the proposed tests have good finite sample properties. An empirical application to survey data on forecasts of GDP growth demonstrates the importance of correcting for serial dependencies in predictability tests. Keywords: contingency tables, canonical correlations, serial dependence, tests of predictability
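As a rough illustration of the quantities this abstract builds on, the sketch below computes the canonical correlations between two categorical sequences from their contingency table. It covers only the classical setting with independent draws and does not implement the serial-dependence corrections the paper proposes. The function name `categorical_canonical_correlations` is an assumption made for this example. In this setting, the sample size times the sum of squared canonical correlations equals the Pearson chi-square statistic of the table.

```python
import numpy as np

def categorical_canonical_correlations(a, b):
    """Canonical correlations between two categorical sequences.

    Illustrative helper (hypothetical name). Uses the singular values of the
    standardized residual matrix of the contingency table; assumes
    independent draws, with no serial-dependence adjustment.
    """
    a = np.asarray(a).ravel()
    b = np.asarray(b).ravel()
    # Map category labels to integer codes 0..r-1 and 0..c-1
    _, ai = np.unique(a, return_inverse=True)
    _, bi = np.unique(b, return_inverse=True)
    r, c = ai.max() + 1, bi.max() + 1
    # Build the r x c contingency table of joint counts
    counts = np.zeros((r, c))
    np.add.at(counts, (ai, bi), 1.0)
    P = counts / counts.sum()  # joint relative frequencies
    pr = P.sum(axis=1)         # row marginals
    pc = P.sum(axis=0)         # column marginals
    # Standardized residuals: (P - pr pc^T) / sqrt(pr_i * pc_j)
    S = (P - np.outer(pr, pc)) / np.sqrt(np.outer(pr, pc))
    s = np.linalg.svd(S, compute_uv=False)
    # At most min(r, c) - 1 canonical correlations are non-trivial
    return s[: min(r, c) - 1]

rng = np.random.default_rng(0)
a = rng.integers(0, 3, size=500)
b = (a + rng.integers(0, 2, size=500)) % 3  # b depends on a
rho = categorical_canonical_correlations(a, b)
print("maximum canonical correlation:", rho[0])
print("n * sum of squares (Pearson chi-square):", a.size * np.sum(rho ** 2))
```

The largest value plays the role of the maximum canonical correlation, and the sum of squares corresponds to a trace-type statistic; how these are adjusted for serially dependent data is specific to the paper and is not reproduced here.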

    Sparse CCA: Adaptive Estimation and Computational Barriers

    Canonical correlation analysis is a classical technique for exploring the relationship between two sets of variables. It has important applications in analyzing high dimensional datasets originating from genomics, imaging and other fields. This paper considers adaptive minimax and computationally tractable estimation of leading sparse canonical coefficient vectors in high dimensions. First, we establish separate minimax estimation rates for canonical coefficient vectors of each set of random variables under no structural assumption on marginal covariance matrices. Second, we propose a computationally feasible estimator to attain the optimal rates adaptively under an additional sample size condition. Finally, we show that a sample size condition of this kind is needed for any randomized polynomial-time estimator to be consistent, assuming hardness of certain instances of the Planted Clique detection problem. The result is faithful to the Gaussian models used in the paper. As a byproduct, we obtain the first computational lower bounds for sparse PCA under the Gaussian single spiked covariance model.