    iPACOSE: an iterative algorithm for the estimation of gene regulation networks

    In the context of Gaussian Graphical Models (GGMs) with high-dimensional small sample data, we present a simple procedure to estimate partial correlations under the constraint that some of them are strictly zero. This method can also be extended to covariance selection. If the goal is to estimate a GGM, our new procedure can be applied to re-estimate the partial correlations after a first graph has been estimated, in the hope of improving the estimation of non-zero coefficients. In a simulation study, we compare our new covariance selection procedure to existing methods and show that the re-estimated partial correlation coefficients may be closer to the real values in important cases.

    Use of pre-transformation to cope with outlying values in important candidate genes

    Outlying values in predictors often strongly affect the results of statistical analyses in high-dimensional settings. Although they frequently occur with most high-throughput techniques, the problem is often ignored in the literature. We suggest using a very simple transformation, proposed before in a different context by Royston and Sauerbrei, as an intermediary step between array normalization and high-level statistical analysis. This straightforward univariate transformation identifies extreme values and considerably reduces the influence of outlying values in all further steps of statistical analysis without eliminating the offending observation or feature. The use of the transformation and its effects are demonstrated for diverse univariate and multivariate statistical analyses using nine publicly available microarray data sets.

    Iterative reconstruction of high-dimensional Gaussian Graphical Models based on a new method to estimate partial correlations under constraints.

    In the context of Gaussian Graphical Models (GGMs) with high-dimensional small sample data, we present a simple procedure, called PACOSE (standing for PArtial COrrelation SElection), to estimate partial correlations under the constraint that some of them are strictly zero. This method can also be extended to covariance selection. If the goal is to estimate a GGM, our new procedure can be applied to re-estimate the partial correlations after a first graph has been estimated, in the hope of improving the estimation of non-zero coefficients. This iterated version of PACOSE is called iPACOSE. In a simulation study, we compare PACOSE to existing methods and show that the re-estimated partial correlation coefficients may be closer to the real values in important cases. In addition, we show on simulated and real data that iPACOSE has attractive properties with regard to sensitivity, positive predictive value, and stability.
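
As background for the abstract above, the standard relation between a precision matrix and partial correlations can be sketched in a few lines. This is not the PACOSE or iPACOSE procedure, whose details are not given here. It is a minimal illustration that assumes an invertible covariance matrix, which the sample covariance is not when there are fewer samples than variables. The helper names `invert` and `partial_correlations` are hypothetical.

```python
def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment the matrix with the identity: [A | I].
    aug = [
        [float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        # Pick the row with the largest absolute entry in this column as pivot.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular or nearly singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    # The right half now holds the inverse.
    return [row[n:] for row in aug]


def partial_correlations(cov):
    """Partial correlations rho_ij = -omega_ij / sqrt(omega_ii * omega_jj),
    where omega is the inverse of the covariance matrix."""
    omega = invert(cov)
    n = len(cov)
    return [
        [
            1.0 if i == j else -omega[i][j] / (omega[i][i] * omega[j][j]) ** 0.5
            for j in range(n)
        ]
        for i in range(n)
    ]


# Illustrative input (made up): a chain X1 - X2 - X3 with correlation 0.5
# between neighbours. X1 and X3 are correlated marginally (0.25) but are
# conditionally independent given X2.
chain_cov = [
    [1.0, 0.5, 0.25],
    [0.5, 1.0, 0.5],
    [0.25, 0.5, 1.0],
]
pcor = partial_correlations(chain_cov)
```

In this toy example the partial correlation between the first and third variables vanishes even though their marginal correlation does not, which is the kind of zero that defines a missing edge in a GGM.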

    SHrinkage Covariance Estimation Incorporating Prior Biological Knowledge with Applications to High-Dimensional Data

    In "-omic data" analysis, information on the structure of covariates is broadly available either from public databases describing gene regulation processes and functional groups, such as the Kyoto Encyclopedia of Genes and Genomes (KEGG), or from statistical analyses, for example in the form of partial correlation estimators. The analysis of transcriptomic data might benefit from the incorporation of such prior knowledge. In this paper we focus on the integration of structured information into statistical analyses in which at least one major step involves the estimation of a (high-dimensional) covariance matrix. More precisely, we revisit the recently proposed "SHrinkage Incorporating Prior" (SHIP) covariance estimation method, which takes into account the group structure of the covariates, and suggest integrating the SHIP covariance estimator into various multivariate methods such as linear discriminant analysis (LDA), global analysis of covariance (GlobalANCOVA), and regularized generalized canonical correlation analysis (RGCCA). We demonstrate the use of the resulting new methods based on simulations and discuss the benefit of the integration of prior information through the SHIP estimator. Reproducible R code is available at http://www.ibe.med.uni-muenchen.de/organisation/mitarbeiter/020_professuren/boulesteix/shipproject/index.html
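
The general idea of shrinking a sample covariance matrix toward a structured target can be sketched as a convex combination, S* = λT + (1 − λ)S. The sketch below is not the SHIP estimator itself. The target used here (keep variances and within-group covariances, set between-group covariances to zero) and the fixed shrinkage intensity `lam` are assumptions made for illustration, and the function name `shrink_covariance` is hypothetical.

```python
def shrink_covariance(sample_cov, groups, lam):
    """Return lam * T + (1 - lam) * S for a group-structured target T.

    sample_cov : square list of lists, the sample covariance matrix S
    groups     : one group label per covariate (e.g. a pathway name)
    lam        : shrinkage intensity in [0, 1], assumed to be given
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    p = len(sample_cov)
    if len(groups) != p:
        raise ValueError("need exactly one group label per covariate")
    # Assumed target: keep entries whose covariates share a group
    # (this includes the diagonal), zero out all other entries.
    target = [
        [sample_cov[i][j] if groups[i] == groups[j] else 0.0 for j in range(p)]
        for i in range(p)
    ]
    return [
        [lam * target[i][j] + (1.0 - lam) * sample_cov[i][j] for j in range(p)]
        for i in range(p)
    ]


# Illustrative input (made up): three covariates, the first two in one group.
S = [
    [2.0, 1.0, 0.4],
    [1.0, 3.0, 0.6],
    [0.4, 0.6, 1.0],
]
shrunk = shrink_covariance(S, groups=["a", "a", "b"], lam=0.5)
```

With this target, within-group entries are left unchanged and between-group covariances are scaled by 1 − λ. In practice the intensity would be estimated from the data rather than fixed by hand.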

    Over-optimism in bioinformatics: an illustration

    In statistical bioinformatics research, different optimization mechanisms potentially lead to "over-optimism" in published papers. The present empirical study illustrates these mechanisms through a concrete example from an active research field. The investigated sources of over-optimism include the optimization of the data sets, of the settings, of the competing methods and, most importantly, of the method’s characteristics. We consider a "promising" new classification algorithm that turns out to yield disappointing results in terms of error rate, namely linear discriminant analysis incorporating prior knowledge on gene functional groups through an appropriate shrinkage of the within-group covariance matrix. We quantitatively demonstrate that this disappointing method can artificially seem superior to existing approaches if we "fish for significance". We conclude that, if the improvement of a quantitative criterion such as the error rate is the main contribution of a paper, the superiority of new algorithms should be validated using "fresh" validation data sets.

    Activated drying in hydrophobic nanopores and the line tension of water

    We study the slow dynamics of water evaporation out of hydrophobic cavities by using model porous silica materials grafted with octylsilanes. The cylindrical pores are monodisperse, with a radius in the range of 1–2 nm. Liquid water penetrates the nanopores at high pressure and empties the pores when the pressure is lowered. The drying pressure exhibits a logarithmic growth as a function of the driving rate over more than three decades, showing the thermally activated nucleation of vapor bubbles. We find that the slow dynamics and the critical volume of the vapor nucleus are quantitatively described by the classical theory of capillarity without any adjustable parameter. However, classical capillarity greatly overestimates the critical bubble energy. We discuss the possible influence of surface heterogeneities, long-range interactions, and high-curvature effects, and we show that a classical theory can describe vapor nucleation provided that a negative line tension is taken into account. The drying pressure then provides a determination of this line tension with much higher precision than currently available methods. We find consistent values of the order of −30 pN in a variety of hydrophobic materials. Keywords: drying transition | hydrophobicity | kinetics | nanobubbles

    Structure of the French farm-to-table surveillance system for Salmonella

    The French surveillance system for Salmonella is based on a national system which can be traced back to 1947 for human cases and to the late 1980s for the main animal reservoirs. This system has evolved with regard to both European regulations and changes in the observed prevalence of Salmonella. European regulations establish a solid foundation on which to build an active harmonised surveillance system at the production level and for integrating data from the whole food chain. There are also passive surveillance networks in the agri-food and veterinary sectors, and these allow complementary information to be obtained from other sectors or sources. The main strengths and weaknesses of these systems are described and a comparison of the different approaches is presented using a grid analysis. The results show that passive systems are very useful for detecting emerging or unusual events and for early warning of outbreaks. They also produce time series of cases or can determine the number of strains that should be used to assess the impact of interventions. Active surveillance data, due to their representativeness and reliability, are key elements in the application of risk analysis tools such as quantitative risk assessment or attribution. Thus, although data are collected and analysed by various organisations, these organisations all collaborate at a national level. Furthermore, their involvement in European and international projects is effective and the main objectives of a surveillance system can be met.

    Benchmarking clinical management of spinal and non-spinal disorders using quality of life: results from the EPI3-LASER survey in primary care

    Concerns have been raised regarding sub-optimal utilization of analgesics and psychotropic drugs in the treatment of patients with chronic musculoskeletal disorders (MSDs) and their associated co-morbidities. The objective of this study was to describe drug prescriptions for the management of spinal and non-spinal MSDs contrasted against a standardized measure of quality of life. A representative population sample of 1,756 MSD patients [38.5% with spinal disorder (SD) and 61.5% with non-spinal MSDs (NS-MSD)] was drawn from the EPI3-LASER survey of 825 general practitioners (GPs) in France. Physicians recorded their diagnoses and prescriptions on that day. Patients provided information on socio-demographics, lifestyle and quality of life using the Short Form 12 (SF-12) questionnaire. Chronicity of MSDs was defined as more than 12 weeks' duration of the current episode. Chronic SD and NS-MSD patients were prescribed fewer analgesics and non-steroidal anti-inflammatory drugs than their non-chronic counterparts [odds ratios (OR) and 95% confidence intervals (CI), respectively: 0.4, 0.2–0.7 and 0.5, 0.3–0.6]. They also had more anxio-depressive co-morbidities reported by their physicians (SD: 16.1 vs. 7.4%; NS-MSD: 21.6 vs. 9.5%), who prescribed more antidepressants and anxiolytics, with a difference that was statistically significant only for spinal disorder patients (OR, 95% CI: 2.0, 1.1–3.6). Psychotropic drugs were more often prescribed to patients in the lower quartile of SF-12 mental score, and analgesics to patients in the lower quartile of SF-12 physical score (P < 0.001). In conclusion, anxiety and depressive disorders were commonly reported by GPs among chronic MSD patients. Their prescriptions of psychotropic and analgesic drugs were consistent with patients’ self-rated mental and physical health.