394 research outputs found

    Integrating biological knowledge into variable selection : an empirical Bayes approach with an application in cancer biology

    Background: An important question in the analysis of biochemical data is that of identifying subsets of molecular variables that may jointly influence a biological response. Statistical variable selection methods have been widely used for this purpose. In many settings, it may be important to incorporate ancillary biological information concerning the variables of interest. Pathway and network maps are one example of a source of such information. However, although ancillary information is increasingly available, it is not always clear how it should be used nor how it should be weighted in relation to primary data. Results: We put forward an approach in which biological knowledge is incorporated using informative prior distributions over variable subsets, with prior information selected and weighted in an automated, objective manner using an empirical Bayes formulation. We employ continuous, linear models with interaction terms and exploit biochemically-motivated sparsity constraints to permit exact inference. We show an example of priors for pathway- and network-based information and illustrate our proposed method both on synthetic response data and in an application to cancer drug response data. Comparisons are also made to alternative Bayesian and frequentist penalised-likelihood methods for incorporating network-based information. Conclusions: The empirical Bayes method proposed here can aid prior elicitation for Bayesian variable selection studies and help to guard against mis-specification of priors. Empirical Bayes, together with the proposed pathway-based priors, results in an approach with a competitive variable selection performance. In addition, the overall procedure is fast, deterministic, and has very few user-set parameters, yet is capable of capturing interplay between molecular players. The approach presented is general and readily applicable in any setting with multiple sources of biological prior knowledge.

    Causal Learning via Manifold Regularization.

    This paper frames causal structure estimation as a machine learning task. The idea is to treat indicators of causal relationships between variables as 'labels' and to exploit available data on the variables of interest to provide features for the labelling task. Background scientific knowledge or any available interventional data provide labels on some causal relationships and the remainder are treated as unlabelled. To illustrate the key ideas, we develop a distance-based approach (based on bivariate histograms) within a manifold regularization framework. We present empirical results on three different biological data sets (including examples where causal effects can be verified by experimental intervention), which together demonstrate the efficacy and general nature of the approach as well as its simplicity from a user's point of view.

    Molecular heterogeneity at the network level: high-dimensional testing, clustering and a TCGA case study.

    MOTIVATION: Molecular pathways and networks play a key role in basic and disease biology. An emerging notion is that networks encoding patterns of molecular interplay may themselves differ between contexts, such as cell type, tissue or disease (sub)type. However, while statistical testing of differences in mean expression levels has been extensively studied, testing of network differences remains challenging. Furthermore, since network differences could provide important and biologically interpretable information to identify molecular subgroups, there is a need to consider the unsupervised task of learning subgroups and networks that define them. This is a nontrivial clustering problem, with neither subgroups nor subgroup-specific networks known at the outset. RESULTS: We leverage recent ideas from high-dimensional statistics for testing and clustering in the network biology setting. The methods we describe can be applied directly to most continuous molecular measurements and networks do not need to be specified beforehand. We illustrate the ideas and methods in a case study using protein data from The Cancer Genome Atlas (TCGA). This provides evidence that patterns of interplay between signalling proteins differ significantly between cancer types. Furthermore, we show how the proposed approaches can be used to learn subtypes and the molecular networks that define them. AVAILABILITY AND IMPLEMENTATION: Available as the Bioconductor package nethet. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

    High-dimensional regression in practice: an empirical study of finite-sample prediction, variable selection and ranking

    Penalized likelihood approaches are widely used for high-dimensional regression. Although many methods have been proposed and the associated theory is now well developed, the relative efficacy of different approaches in finite-sample settings, as encountered in practice, remains incompletely understood. There is therefore a need for empirical investigations in this area that can offer practical insight and guidance to users. In this paper, we present a large-scale comparison of penalized regression methods. We distinguish between three related goals: prediction, variable selection and variable ranking. Our results span more than 2300 data-generating scenarios, including both synthetic and semisynthetic data (real covariates and simulated responses), allowing us to systematically consider the influence of various factors (sample size, dimensionality, sparsity, signal strength and multicollinearity). We consider several widely used approaches (Lasso, Adaptive Lasso, Elastic Net, Ridge Regression, SCAD, the Dantzig Selector and Stability Selection). We find considerable variation in performance between methods. Our results support a “no panacea” view, with no unambiguous winner across all scenarios or goals, even in this restricted setting where all data align well with the assumptions underlying the methods. The study allows us to make some recommendations as to which approaches may be most (or least) suitable given the goal and some data characteristics. Our empirical results complement existing theory and provide a resource to compare methods across a range of scenarios and metrics.

    Boundary States and Black Hole Entropy

    Black hole entropy is derived from a sum over boundary states. The boundary states are labeled by energy and momentum surface densities, and parametrized by the boundary metric. The sum over state labels is expressed as a functional integral with measure determined by the density of states. The sum over metrics is expressed as a functional integral with measure determined by the universal expression for the inverse temperature gradient at the horizon. The analysis applies to any stationary, nonextreme black hole in any theory of gravitational and matter fields.

    Relationship between clinical signs and postmortem test status in cattle experimentally infected with the bovine spongiform encephalopathy agent

    Background: Various clinical protocols have been developed to aid in the clinical diagnosis of classical bovine spongiform encephalopathy (BSE), which is confirmed by postmortem examinations based on vacuolation and accumulation of disease-associated prion protein (PrPᵈ) in the brain. The present study investigated the occurrence and progression of sixty selected clinical signs and behaviour combinations in 513 experimentally exposed cattle subsequently categorised postmortem as confirmed or unconfirmed BSE cases. Appropriate undosed or saline inoculated controls were examined similarly and the data analysed to explore the possible occurrence of BSE-specific clinical expression in animals unconfirmed by postmortem examinations. Results: Based on the display of selected behavioural, sensory and locomotor changes, 20 (67%) orally dosed and 17 (77%) intracerebrally inoculated pathologically confirmed BSE cases and 21 (13%) orally dosed and 18 (6%) intracerebrally inoculated but unconfirmed cases were considered clinical BSE suspects. None of the 103 controls showed significant signs, and all were negative on diagnostic postmortem examinations. Signs indicative of BSE suspects, particularly over-reactivity and ataxia, were more frequently displayed in confirmed cases with vacuolar changes in the brain. The display of several BSE-associated signs over time, including repeated startle responses and nervousness, was significantly more frequent in confirmed BSE cases compared to controls, but these two signs were also significantly more frequent in orally dosed cattle unconfirmed by postmortem examinations. Conclusions: The findings confirm that in experimentally infected cattle clinical abnormalities indicative of BSE are accompanied by vacuolar changes and PrPᵈ accumulation in the brainstem. The presence of more frequently expressed signs in cases with vacuolar changes is consistent with this pathology representing a more advanced stage of disease. That BSE-like signs or sign combinations occur in inoculated animals that were not confirmed as BSE cases by postmortem examinations requires further study to investigate the potential causal relationship with prion disease.