1,342 research outputs found

    Component-based structural equation modelling

    In this research, the authors explore the use of ULS-SEM (unweighted least squares structural equation modelling), PLS (Partial Least Squares), GSCA (Generalized Structured Component Analysis), path analysis on block principal components and path analysis on block scales on customer satisfaction data.
    Keywords: Component-based SEM; covariance-based SEM; GSCA; path analysis; PLS path modelling; Structural Equation Modelling; Unweighted Least Squares

    Regularized generalized canonical correlation analysis and the PLS approach (original title: Analyse canonique généralisée régularisée et approche PLS)

    In this paper we give a definition of generalized canonical correlation analysis at the population level (ACG-population in the original French), which provides the theoretical framework for the PLS approach proposed by Herman Wold and for its extensions proposed by Jan-Bernd Lohmöller and Nicole Krämer. By writing the stationary equations of the population-level analysis at the sample level and using regularized (shrinkage) estimates of the block covariance matrices, we obtain new stationary equations at the sample level. These are also the stationary equations of an optimization problem that we call regularized generalized canonical correlation analysis (ACGR in the original French). Searching for a fixed point of these sample-level stationary equations yields an algorithm very similar to the Wold-Lohmöller-Krämer PLS approach. Moreover, we prove the monotone convergence of the proposed algorithm.
    Keywords: multiblock data analysis; PLS approach; regularized generalized canonical correlation analysis
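
    For orientation, the regularized problem this abstract refers to is usually written as follows. This is a sketch in notation that the abstract itself does not give: the blocks X_j, weight vectors a_j, design coefficients c_jk, scheme function g and shrinkage constants tau_j are all assumed symbols.

```latex
% Assumed notation: J blocks X_1, ..., X_J with weight vectors a_j,
% design coefficients c_{jk} in {0, 1}, a scheme function g
% (identity, absolute value, or square), shrinkage constants tau_j in [0, 1].
\[
\max_{a_1,\dots,a_J} \; \sum_{j \neq k} c_{jk}\,
  g\!\left(\widehat{\operatorname{cov}}(X_j a_j,\, X_k a_k)\right)
\quad \text{subject to} \quad
\tau_j \lVert a_j \rVert^2 + (1-\tau_j)\,\widehat{\operatorname{var}}(X_j a_j) = 1,
\qquad j = 1,\dots,J.
\]
% The constraint corresponds to replacing the sample block covariance S_{jj}
% by the shrinkage estimate  tau_j I + (1 - tau_j) S_{jj}.
```

    With tau_j = 1 the constraint reduces to a unit-norm constraint on a_j, and with tau_j = 0 to a unit-variance constraint on the block component X_j a_j.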

    A low variance consistent test of relative dependency

    We describe a novel non-parametric statistical hypothesis test of relative dependence between a source variable and two candidate target variables. Such a test enables us to determine whether one source variable is significantly more dependent on a first target variable or a second. Dependence is measured via the Hilbert-Schmidt Independence Criterion (HSIC), resulting in a pair of empirical dependence measures (source-target 1, source-target 2). We test whether the first dependence measure is significantly larger than the second. Modeling the covariance between these HSIC statistics leads to a provably more powerful test than the construction of independent HSIC statistics by sub-sampling. The resulting test is consistent and unbiased, and (being based on U-statistics) has favorable convergence properties. The test can be computed in quadratic time, matching the computational complexity of standard empirical HSIC estimators. The effectiveness of the test is demonstrated on several real-world problems: we identify language groups from a multilingual corpus, and we prove that tumor location is more dependent on gene expression than chromosomal imbalances. Source code is available for download at https://github.com/wbounliphone/reldep.
    Comment: International Conference on Machine Learning, Jul 2015, Lille, France
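
    To make the dependence measure concrete, here is a minimal sketch of an empirical HSIC computation, under stated assumptions. It uses the simple biased estimator trace(KHLH)/(n-1)^2 with a Gaussian kernel, not the U-statistic estimators or the joint covariance model on which the test above is built, and it only compares two HSIC values without computing a p-value. The function names and the fixed bandwidth `sigma` are hypothetical.

```python
import numpy as np

def gaussian_gram(x, sigma=1.0):
    """Gaussian-kernel Gram matrix for samples stored in the rows of x."""
    x = np.asarray(x, dtype=float).reshape(len(x), -1)
    sq = np.sum(x ** 2, axis=1)
    # Squared Euclidean distances, clipped at 0 to absorb rounding error.
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * x @ x.T, 0.0)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def hsic_biased(x, y, sigma=1.0):
    """Biased empirical HSIC: trace(K H L H) / (n - 1)^2 (illustrative only)."""
    n = len(x)
    K = gaussian_gram(x, sigma)
    L = gaussian_gram(y, sigma)
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    source = rng.normal(size=200)
    target_1 = source + 0.1 * rng.normal(size=200)  # strongly dependent on source
    target_2 = rng.normal(size=200)                 # independent of source
    print(hsic_biased(source, target_1), hsic_biased(source, target_2))
```

    Each call forms two n-by-n Gram matrices, which is the quadratic cost the abstract mentions. Deciding whether the gap between the two values is significant is the part the paper's test addresses and this sketch does not.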

    R&D Productivity: an International Study

    The objective of this paper is to explore the impact of R&D expenditures on company performance. R&D activities play an essential role in the future economic development and financial performance of firms. However, with the exception of some American studies, the economic effectiveness of such investment is seldom demonstrated explicitly in the literature, and to the best of our knowledge, there are no existing studies on R&D productivity taking an international approach. Our research design is based on an earnings equation associating earnings with recorded assets, R&D expenditures and selling, general and administrative (SG&A) expenses (proxying advertising expenses). We determine a rate of return on R&D for each given sample of firms in 12 developed countries. Our results corroborate previous studies of American companies, which found that reported earnings, adjusted for the expensing of R&D, reflect realized benefits from R&D. This study confirms the positive contribution of R&D activities to future company performance, although this contribution can vary from one country to another.
    Keywords: R&D productivity; R&D profitability; international study

    SHrinkage Covariance Estimation Incorporating Prior Biological Knowledge with Applications to High-Dimensional Data

    In "-omic data" analysis, information on the structure of covariates is broadly available either from public databases describing gene regulation processes and functional groups such as the Kyoto encyclopedia of genes and genomes (KEGG), or from statistical analyses, for example in the form of partial correlation estimators. The analysis of transcriptomic data might benefit from the incorporation of such prior knowledge. In this paper we focus on the integration of structured information into statistical analyses in which at least one major step involves the estimation of a (high-dimensional) covariance matrix. More precisely, we revisit the recently proposed "SHrinkage Incorporating Prior" (SHIP) covariance estimation method, which takes into account the group structure of the covariates, and suggest integrating the SHIP covariance estimator into various multivariate methods such as linear discriminant analysis (LDA), global analysis of covariance (GlobalANCOVA), and regularized generalized canonical correlation analysis (RGCCA). We demonstrate the use of the resulting new methods based on simulations and discuss the benefit of integrating prior information through the SHIP estimator. Reproducible R code is available at http://www.ibe.med.uni-muenchen.de/organisation/mitarbeiter/020_professuren/boulesteix/shipproject/index.html
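
    The general idea of shrinking a sample covariance matrix toward a group-structured target can be sketched as below. This is an illustration under stated assumptions, not the SHIP estimator itself: the target used here (mean within-group correlation inside a group, zero across groups) is one plausible choice, the shrinkage intensity `lam` is supplied by hand rather than estimated from the data, and the function names are hypothetical.

```python
import numpy as np

def group_target(S, groups):
    """Hypothetical target: sample variances on the diagonal, the mean
    within-group correlation between covariates of the same group,
    and zero covariance across groups."""
    groups = np.asarray(groups)
    sd = np.sqrt(np.diag(S))           # assumes no zero-variance covariate
    R = S / np.outer(sd, sd)           # sample correlation matrix
    same = groups[:, None] == groups[None, :]
    off = same & ~np.eye(len(groups), dtype=bool)
    r_bar = R[off].mean() if off.any() else 0.0
    T = np.where(off, r_bar * np.outer(sd, sd), 0.0)
    T[np.diag_indices_from(T)] = np.diag(S)
    return T

def shrink_covariance(X, groups, lam):
    """Convex combination lam * T + (1 - lam) * S, with lam in [0, 1]."""
    S = np.cov(X, rowvar=False)        # rows are samples, columns covariates
    T = group_target(S, groups)
    return lam * T + (1.0 - lam) * S

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(5, 6))        # fewer samples than covariates
    groups = [0, 0, 0, 1, 1, 1]        # assumed functional-group labels
    print(np.round(shrink_covariance(X, groups, lam=0.5), 3))
```

    The shrunken matrix can then be passed to any method that needs a covariance estimate, such as LDA, in place of the sample covariance.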

    Over-optimism in bioinformatics: an illustration

    In statistical bioinformatics research, different optimization mechanisms potentially lead to "over-optimism" in published papers. The present empirical study illustrates these mechanisms through a concrete example from an active research field. The investigated sources of over-optimism include the optimization of the data sets, of the settings, of the competing methods and, most importantly, of the method's characteristics. We consider a "promising" new classification algorithm that turns out to yield disappointing results in terms of error rate, namely linear discriminant analysis incorporating prior knowledge on gene functional groups through an appropriate shrinkage of the within-group covariance matrix. We quantitatively demonstrate that this disappointing method can artificially seem superior to existing approaches if we "fish for significance". We conclude that, if the improvement of a quantitative criterion such as the error rate is the main contribution of a paper, the superiority of new algorithms should be validated using "fresh" validation data sets.

    State-of-art on PLS Path Modeling through the available software

    The purpose of this paper is to present PLS Path Modeling, to describe the various options of LVPLS 1.8 and PLS-Graph 3.0 for carrying out a path model, and to comment on the output of both programs. PLS-Graph 3.0 is actually based on LVPLS 1.8. As an added value, PLS-Graph has a very friendly graphical interface for drawing the model and a resampling module (jackknife and bootstrap). The presentation is illustrated with data that were used to construct the European Consumer Satisfaction Index (ECSI) for a mobile phone provider.
    Keywords: PLS Path Modeling; PLS Approach; Structural Equation Modeling; LVPLS 1.8; PLS-Graph

    Multiway discriminant factor analysis (original title: Analyse Factorielle Discriminante Multi-voie)

    Discriminant factor analysis is extended to multiway data, that is, data for which several modalities have been observed for each variable. Multiway data are thus structured as a tensor. The proposed extension relies on a model of the discriminant axes that takes the tensor structure of the data into account. Compared with methods that build a classifier from the matrix obtained by unfolding the tensor, the expected gains are better interpretability and greater robustness to overfitting, a phenomenon that becomes more pronounced in the multiway setting as the number of modalities grows. An alternating directions algorithm is used to obtain the discriminant axes. The performance obtained on simulated data confirms these gains.