
    iPACOSE: an iterative algorithm for the estimation of gene regulation networks

    In the context of Gaussian Graphical Models (GGMs) with high-dimensional small sample data, we present a simple procedure to estimate partial correlations under the constraint that some of them are strictly zero. This method can also be extended to covariance selection. If the goal is to estimate a GGM, our new procedure can be applied to re-estimate the partial correlations after a first graph has been estimated, in the hope of improving the estimation of non-zero coefficients. In a simulation study, we compare our new covariance selection procedure to existing methods and show that the re-estimated partial correlation coefficients may be closer to the real values in important cases.

    Use of pre-transformation to cope with outlying values in important candidate genes

    Outlying values in predictors often strongly affect the results of statistical analyses in high-dimensional settings. Although they frequently occur with most high-throughput techniques, the problem is often ignored in the literature. We suggest using a very simple transformation, proposed before in a different context by Royston and Sauerbrei, as an intermediary step between array normalization and high-level statistical analysis. This straightforward univariate transformation identifies extreme values and considerably reduces the influence of outlying values in all further steps of statistical analysis without eliminating the affected observation or feature. The use of the transformation and its effects are demonstrated for diverse univariate and multivariate statistical analyses using nine publicly available microarray data sets.

    Iterative reconstruction of high-dimensional Gaussian Graphical Models based on a new method to estimate partial correlations under constraints.

    In the context of Gaussian Graphical Models (GGMs) with high-dimensional small sample data, we present a simple procedure, called PACOSE (standing for PArtial COrrelation SElection), to estimate partial correlations under the constraint that some of them are strictly zero. This method can also be extended to covariance selection. If the goal is to estimate a GGM, our new procedure can be applied to re-estimate the partial correlations after a first graph has been estimated, in the hope of improving the estimation of non-zero coefficients. This iterated version of PACOSE is called iPACOSE. In a simulation study, we compare PACOSE to existing methods and show that the re-estimated partial correlation coefficients may be closer to the real values in important cases. In addition, we show on simulated and real data that iPACOSE has very interesting properties with regard to sensitivity, positive predictive value and stability.
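As background for the abstract above, the following is a minimal sketch of the standard relation between partial correlations and the inverse covariance (precision) matrix, rho_ij = -omega_ij / sqrt(omega_ii * omega_jj). It is not the PACOSE or iPACOSE procedure itself. The function name `partial_correlations` and the toy three-variable chain are assumptions made for illustration, and the sketch presumes an invertible covariance matrix, which the sample covariance is not in the high-dimensional small sample setting.

```python
import numpy as np

def partial_correlations(cov):
    """Partial correlation matrix from an invertible covariance matrix.

    Hypothetical helper: uses the textbook identity
    rho_ij = -omega_ij / sqrt(omega_ii * omega_jj),
    where omega is the inverse of the covariance matrix.
    """
    omega = np.linalg.inv(cov)          # precision (concentration) matrix
    d = np.sqrt(np.diag(omega))
    pcor = -omega / np.outer(d, d)
    np.fill_diagonal(pcor, 1.0)         # by convention, diagonal set to 1
    return pcor

# Toy chain X1 - X2 - X3: the precision matrix has a zero at (1, 3),
# i.e. X1 and X3 are conditionally independent given X2.
omega_true = np.array([[2.0, -0.8, 0.0],
                       [-0.8, 2.0, -0.8],
                       [0.0, -0.8, 2.0]])
cov_true = np.linalg.inv(omega_true)
pcor = partial_correlations(cov_true)
```

In this toy example a zero entry of the precision matrix corresponds to a zero partial correlation, which is the kind of constraint the abstract refers to.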

    Cooperation, the power of a single word. Some experimental evidence on wording and gender effects in a Game of Chicken

    Wording has been widely shown to affect decision making. In this paper, we investigate experimentally whether, and to what extent, cooperative behaviour in a Game of Chicken may be impacted by a very basic change in the labelling of the strategies. Our within-subject experimental design involves two treatments. The only difference between them is that we introduce either a socially-oriented wording (‘I cooperate’/‘I do not cooperate’) or colours (red/blue) to designate strategies. The level of cooperation appears to be higher in the socially-oriented context, but only when the uncertainty regarding the type of the partner is manipulated, and especially among females. Keywords: Social dilemma, Game of Chicken, cooperation, wording effects, gender effects.

    SHrinkage Covariance Estimation Incorporating Prior Biological Knowledge with Applications to High-Dimensional Data

    In "-omic data" analysis, information on the structure of covariates is broadly available either from public databases describing gene regulation processes and functional groups, such as the Kyoto Encyclopedia of Genes and Genomes (KEGG), or from statistical analyses, for example in the form of partial correlation estimators. The analysis of transcriptomic data might benefit from the incorporation of such prior knowledge. In this paper we focus on the integration of structured information into statistical analyses in which at least one major step involves the estimation of a (high-dimensional) covariance matrix. More precisely, we revisit the recently proposed "SHrinkage Incorporating Prior" (SHIP) covariance estimation method, which takes into account the group structure of the covariates, and suggest integrating the SHIP covariance estimator into various multivariate methods such as linear discriminant analysis (LDA), global analysis of covariance (GlobalANCOVA), and regularized generalized canonical correlation analysis (RGCCA). We demonstrate the use of the resulting new methods based on simulations and discuss the benefit of the integration of prior information through the SHIP estimator. Reproducible R code is available at http://www.ibe.med.uni-muenchen.de/organisation/mitarbeiter/020_professuren/boulesteix/shipproject/index.html
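To make the idea of shrinkage towards a structured target concrete, here is a minimal sketch of generic linear shrinkage, lam * T + (1 - lam) * S, where S is the sample covariance. It is not the SHIP estimator: the target T used here (within-group covariances kept, between-group covariances set to zero), the function name `shrink_covariance`, the group labels and the hand-picked intensity `lam` are all assumptions made for illustration, whereas shrinkage methods typically estimate the intensity from the data.

```python
import numpy as np

def shrink_covariance(X, groups, lam):
    """Linear shrinkage of the sample covariance towards a group-structured target.

    Hypothetical illustration: the target keeps the sample covariances of
    variables in the same group and zeroes out all between-group entries.
    """
    S = np.cov(X, rowvar=False)                       # p x p sample covariance
    same_group = groups[:, None] == groups[None, :]   # True where i and j share a group
    T = np.where(same_group, S, 0.0)                  # block-structured target
    return lam * T + (1.0 - lam) * S

# Small-sample toy data: n = 5 observations, p = 8 variables in 3 groups.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))
groups = np.array([0, 0, 0, 1, 1, 1, 2, 2])

S = np.cov(X, rowvar=False)
sigma = shrink_covariance(X, groups, lam=0.5)
```

With n = 5 and p = 8 the sample covariance S is singular, while the shrunken matrix leaves within-group entries unchanged and damps between-group entries by the factor (1 - lam).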

    Stochastic Viability of Second Generation Biofuel Chains: Micro-economic Spatial Modeling in France

    Within an overall project to assess the ability of the agricultural sector to contribute to bioenergy production, we set out here to examine the economic and technological viability of a bioenergy facility in an uncertain economic context, using the stochastic viability approach. We consider two viability constraints: the facility's demand for lignocellulosic feedstock has to be satisfied each year, and the associated supply cost has to be lower than the profitability threshold of the facility. We assess the viability probability of various supply strategies, which consist of contracting a given share of the feedstock demand with perennial dedicated crops at the initial time and then making up the remainder each year with annual dedicated crops or wood. The demand constraints and agricultural price scenarios over the time horizon are introduced into an agricultural and forest biomass supply model, which in turn determines the supply cost per MWh and computes the viability probabilities of the various contract strategies. A sensitivity analysis with respect to agricultural prices at the initial time is performed. Results show that when these prices are around or under the median (of the 1993–2007 prices), the strategy of contracting 100% of the feedstock supply with perennial dedicated crops is the best one. Keywords: Biofuel, Biomass production, Spatial economics, Stochastic viability, Monte Carlo simulation, Resource/Energy Economics and Policy.

    Over-optimism in bioinformatics: an illustration

    In statistical bioinformatics research, different optimization mechanisms potentially lead to "over-optimism" in published papers. The present empirical study illustrates these mechanisms through a concrete example from an active research field. The investigated sources of over-optimism include the optimization of the data sets, of the settings, of the competing methods and, most importantly, of the method’s characteristics. We consider a "promising" new classification algorithm that turns out to yield disappointing results in terms of error rate, namely linear discriminant analysis incorporating prior knowledge on gene functional groups through an appropriate shrinkage of the within-group covariance matrix. We quantitatively demonstrate that this disappointing method can artificially seem superior to existing approaches if we "fish for significance". We conclude that, if the improvement of a quantitative criterion such as the error rate is the main contribution of a paper, the superiority of new algorithms should be validated using "fresh" validation data sets.