
    SHrinkage Covariance Estimation Incorporating Prior Biological Knowledge with Applications to High-Dimensional Data

    In "-omic data" analysis, information on the structure of covariates is broadly available, either from public databases describing gene regulation processes and functional groups, such as the Kyoto Encyclopedia of Genes and Genomes (KEGG), or from statistical analyses, for example in the form of partial correlation estimators. The analysis of transcriptomic data might benefit from the incorporation of such prior knowledge. In this paper we focus on the integration of structured information into statistical analyses in which at least one major step involves the estimation of a (high-dimensional) covariance matrix. More precisely, we revisit the recently proposed "SHrinkage Incorporating Prior" (SHIP) covariance estimation method, which takes into account the group structure of the covariates, and suggest integrating the SHIP covariance estimator into various multivariate methods such as linear discriminant analysis (LDA), global analysis of covariance (GlobalANCOVA), and regularized generalized canonical correlation analysis (RGCCA). We demonstrate the use of the resulting new methods based on simulations and discuss the benefit of the integration of prior information through the SHIP estimator. Reproducible R code is available at http://www.ibe.med.uni-muenchen.de/organisation/mitarbeiter/020_professuren/boulesteix/shipproject/index.html
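
The general idea of shrinking a sample covariance matrix toward a group-structured target can be sketched as follows. This is a minimal illustration in Python, not the authors' R code: the function names `group_target` and `shrink_covariance`, the particular target (sample variances on the diagonal, one common average correlation for pairs of covariates in the same group, zero between groups), and the hand-picked shrinkage intensity `lam` are all assumptions made for the example. The abstract does not say how the target or the intensity are chosen in SHIP.

```python
import numpy as np

def group_target(S, groups):
    """Hypothetical group-structured target (an assumption, not taken from the paper).

    Keeps the sample variances, gives every within-group pair the same average
    correlation, and sets between-group covariances to zero.
    Assumes all sample variances are strictly positive.
    """
    groups = np.asarray(groups)
    sd = np.sqrt(np.diag(S))
    R = S / np.outer(sd, sd)                          # sample correlations
    same = groups[:, None] == groups[None, :]         # pairs sharing a group
    off = same & ~np.eye(len(groups), dtype=bool)     # exclude the diagonal
    r_bar = R[off].mean() if off.any() else 0.0       # average within-group correlation
    T = np.where(off, r_bar * np.outer(sd, sd), 0.0)
    T[np.diag_indices_from(T)] = np.diag(S)           # keep sample variances
    return T

def shrink_covariance(X, groups, lam):
    """Convex combination lam * T + (1 - lam) * S, with lam in [0, 1] set by hand."""
    S = np.cov(X, rowvar=False)                       # rows are samples, columns covariates
    T = group_target(S, groups)
    return lam * T + (1.0 - lam) * S
```

With `lam = 0` the function returns the sample covariance unchanged, and with `lam = 1` it returns the target alone; intermediate values trade the variance of the sample estimate against the bias of the structured target.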

    Over-optimism in bioinformatics: an illustration

    In statistical bioinformatics research, different optimization mechanisms potentially lead to "over-optimism" in published papers. The present empirical study illustrates these mechanisms through a concrete example from an active research field. The investigated sources of over-optimism include the optimization of the data sets, of the settings, of the competing methods and, most importantly, of the method’s characteristics. We consider a "promising" new classification algorithm that turns out to yield disappointing results in terms of error rate, namely linear discriminant analysis incorporating prior knowledge on gene functional groups through an appropriate shrinkage of the within-group covariance matrix. We quantitatively demonstrate that this disappointing method can artificially seem superior to existing approaches if we "fish for significance". We conclude that, if the improvement of a quantitative criterion such as the error rate is the main contribution of a paper, the superiority of new algorithms should be validated using "fresh" validation data sets.

    The Performance of MLEM for Dynamic Imaging From Simulated Few-View, Multi-Pinhole SPECT

    Stationary small-animal SPECT systems are being developed for rapid dynamic imaging from limited angular views. This work quantified, through simulations, the performance of Maximum Likelihood Expectation Maximization (MLEM) for reconstructing a time-activity curve (TAC) with uptake duration of a few seconds from a stationary, three-camera multi-pinhole SPECT system. The study also quantified the benefits of a heuristic method of initializing the reconstruction with a prior image reconstructed from a conventional number of views, for example from data acquired during the late-study portion of the dynamic TAC. We refer to MLEM reconstruction initialized by a prior-image initial guess (IG) as MLEMig. The effect of the prior-image initial guess on the depiction of contrast between two regions of a static phantom was quantified over a range of angular sampling schemes. A TAC was modeled from the experimentally measured uptake of 99mTc-hexamethylpropyleneamine oxime (HMPAO) in the rat lung. The resulting time series of simulated images was quantitatively analyzed with respect to the accuracy of the estimated exponential washin and washout parameters. In both static and dynamic phantom studies, the prior-image initial guess improved the spatial depiction of the phantom, for example through improved definition of the cylinder boundaries and more accurate quantification of relative contrast between cylinders. In the dynamic study, for example, there was ~50% error in relative contrast for MLEM reconstructions compared to ~25-30% error for MLEMig. In the static phantom study, the benefits of the initial guess decreased as the number of views increased. The prior-image initial guess introduced an additive offset in the reconstructed dynamic images, likely due to biases introduced by the prior image. MLEM initialized with a uniform initial guess yielded images that faithfully reproduced the time dependence of the simulated TAC; there were no statistically significant differences in the mean exponential washin/washout parameters estimated from MLEM reconstructions compared to the true values. Washout parameters estimated from MLEMig reconstructions did not differ significantly from the true values; however, the estimated washin parameter differed significantly from the true value in some cases. Overall, MLEM reconstruction from few views and a uniform initial guess accurately quantified the time dependence of the TAC while introducing errors in the spatial depiction of the object. Initializing the reconstruction with a late-study initial guess improved spatial accuracy while decreasing temporal accuracy in some cases.
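
The difference between a uniform initial guess and a prior-image initial guess is easiest to see in the update rule itself. The following is a minimal sketch of the standard multiplicative MLEM iteration in Python, under stated assumptions: the function name `mlem`, the parameters `x0`, `n_iter` and `eps`, and the small dense system matrix `A` are hypothetical and chosen for illustration. They do not reproduce the multi-pinhole geometry or the settings of the study.

```python
import numpy as np

def mlem(A, y, x0=None, n_iter=50, eps=1e-12):
    """Standard MLEM update: x <- x / (A^T 1) * A^T (y / (A x)).

    A  : non-negative system matrix, shape (n_measurements, n_voxels)
    y  : measured counts, shape (n_measurements,)
    x0 : initial image; None means a uniform image of ones, while passing a
         previously reconstructed image mimics a prior-image initial guess
    """
    A = np.asarray(A, dtype=float)
    y = np.asarray(y, dtype=float)
    x = np.ones(A.shape[1]) if x0 is None else np.asarray(x0, dtype=float).copy()
    sens = np.maximum(A.sum(axis=0), eps)        # sensitivity image, A^T 1
    for _ in range(n_iter):
        proj = np.maximum(A @ x, eps)            # forward projection, guarded against 0
        x = x / sens * (A.T @ (y / proj))        # multiplicative update keeps x >= 0
    return x
```

Because the update is multiplicative, the result after a finite number of iterations depends on the starting image, which is consistent with the abstract's observation that a prior-image initial guess can change both the spatial and the temporal behavior of the reconstruction.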

    Acoustic signatures of the seafloor: Tools for predicting grouper habitat

    Groupers are important components of commercial and recreational fisheries. Current methods of diver-based grouper census surveys could potentially benefit from development of remotely sensed methods of seabed classification. The goal of the present study was to determine if areas of high grouper abundance have characteristic acoustic signatures. A commercial acoustic seabed mapping system, QTC View Series V, was used to survey an area near Carysfort Reef, Florida Keys. Acoustic data were clustered using QTC IMPACT software, resulting in three main acoustic classes covering 94% of the area surveyed. Diver-based data indicate that one of the acoustic classes corresponded to hard substrate and the other two represented sediment. A new measurement of seabed heterogeneity, designated acoustic variability, was also computed from the acoustic survey data in order to more fully characterize the acoustic response (i.e., the signature) of the seafloor. When compared with diver-based grouper census data, both acoustic classification and acoustic variability were significantly different at sites with and without groupers. Sites with groupers were characterized by hard bottom substrate and high acoustic variability. Thus, the acoustic signature of a site, as measured by acoustic classification or acoustic variability, is a potentially useful tool for stratifying diver sampling effort for grouper census.