227 research outputs found

    Accuracy of Gene Scores when Pruning Markers by Linkage Disequilibrium.

    OBJECTIVE: Gene scores are often used to model the combined effects of genetic variants. When variants are in linkage disequilibrium, it is common to prune all variants except the most strongly associated. This avoids duplicating information but discards information when variants have independent effects. However, joint modelling of correlated variants increases the sampling error in the gene score. In recent applications, joint modelling has offered only small improvements in accuracy over pruning. We aimed to quantify the relationship between pruning and joint modelling in relation to sample size. METHODS: We derived the coefficient of determination R² for a gene score constructed from pruned markers, and for one constructed from correlated markers with jointly estimated effects. RESULTS: Pruned scores tend to have slightly lower R² than jointly modelled scores, but the differences are small at sample sizes up to 100,000. If the proportion of correlated variants is high, joint modelling can obtain modest improvements asymptotically. CONCLUSIONS: The small gains observed to date from joint modelling can be explained by sample size. As studies become larger, joint modelling will be useful for traits affected by many correlated variants, but the improvements may remain small. Pruning remains a useful heuristic for current studies.
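
The pruning step described in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the function name `prune_by_ld`, the toy p-values, the pairwise r² matrix, and the 0.2 threshold are all assumptions chosen for illustration. The sketch ranks variants by association strength and keeps a variant only if it is not in strong linkage disequilibrium with one already kept.

```python
def prune_by_ld(p_values, r2, threshold=0.2):
    """Greedy LD pruning (illustrative only).

    p_values:  association p-value per variant (smaller = stronger).
    r2:        square matrix of pairwise squared correlations.
    threshold: hypothetical r^2 cut-off above which two variants
               are treated as carrying duplicate information.
    """
    # Visit variants from most to least strongly associated.
    order = sorted(range(len(p_values)), key=lambda i: p_values[i])
    kept = []
    for i in order:
        # Keep variant i only if it is weakly correlated with every kept variant.
        if all(r2[i][j] <= threshold for j in kept):
            kept.append(i)
    return sorted(kept)


# Hypothetical toy data: variants 0 and 1 are in strong LD, variant 2 is independent.
p_values = [1e-8, 1e-5, 1e-3]
r2 = [
    [1.0, 0.9, 0.0],
    [0.9, 1.0, 0.1],
    [0.0, 0.1, 1.0],
]
kept = prune_by_ld(p_values, r2)
print(kept)
```

In this toy case variant 1 is discarded because it is correlated with the stronger variant 0, which is exactly the information loss the abstract weighs against the extra sampling error of joint modelling.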

    JAM: A Scalable Bayesian Framework for Joint Analysis of Marginal SNP Effects.

    Recently, large scale genome-wide association study (GWAS) meta-analyses have boosted the number of known signals for some traits into the tens and hundreds. Typically, however, variants are only analysed one-at-a-time. This complicates the ability of fine-mapping to identify a small set of SNPs for further functional follow-up. We describe a new and scalable algorithm, joint analysis of marginal summary statistics (JAM), for the re-analysis of published marginal summary statistics under joint multi-SNP models. The correlation is accounted for according to estimates from a reference dataset, and models and SNPs that best explain the complete joint pattern of marginal effects are highlighted via an integrated Bayesian penalized regression framework. We provide both enumerated and Reversible Jump MCMC implementations of JAM and present some comparisons of performance. In a series of realistic simulation studies, JAM demonstrated identical performance to various alternatives designed for single region settings. In multi-region settings, where the only multivariate alternative involves stepwise selection, JAM offered greater power and specificity. We also present an application to real published results from MAGIC (meta-analysis of glucose and insulin related traits consortium) - a GWAS meta-analysis of more than 15,000 people. We re-analysed several genomic regions that produced multiple significant signals with glucose levels 2 hr after oral stimulation. Through joint multivariate modelling, JAM was able to formally rule out many SNPs, and for one gene, ADCY5, suggests that an additional SNP, which transpired to be more biologically plausible, should be followed up with equal priority to the reported index

    Tailored Bayes: a risk modeling framework under unequal misclassification costs.

    Risk prediction models are a crucial tool in healthcare. Risk prediction models with a binary outcome (i.e., binary classification models) are often constructed using methodology which assumes the costs of different classification errors are equal. In many healthcare applications, this assumption is not valid, and the differences between misclassification costs can be quite large. For instance, in a diagnostic setting, the cost of misdiagnosing a person with a life-threatening disease as healthy may be larger than the cost of misdiagnosing a healthy person as a patient. In this article, we present Tailored Bayes (TB), a novel Bayesian inference framework which "tailors" model fitting to optimize predictive performance with respect to unbalanced misclassification costs. We use simulation studies to showcase when TB is expected to outperform standard Bayesian methods in the context of logistic regression. We then apply TB to three real-world applications, a cardiac surgery, a breast cancer prognostication task, and a breast cancer tumor classification task and demonstrate the improvement in predictive performance over standard methods
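
To make the idea of unequal misclassification costs concrete, here is a small sketch of the standard decision-theoretic threshold. It is not the Tailored Bayes method itself, which changes how the model is fitted; the function names and cost values below are assumptions for illustration. Given a predicted risk p, predicting "positive" has the lower expected cost when p is at least cost_fp / (cost_fp + cost_fn).

```python
def cost_threshold(cost_fp, cost_fn):
    """Risk threshold that minimises expected misclassification cost.

    cost_fp: cost of labelling a healthy person as diseased.
    cost_fn: cost of labelling a diseased person as healthy.
    """
    return cost_fp / (cost_fp + cost_fn)


def classify(prob, cost_fp, cost_fn):
    """Return True (predict positive) when the risk meets the cost-based threshold."""
    return prob >= cost_threshold(cost_fp, cost_fn)


# Hypothetical costs: a missed diagnosis is nine times worse than a false alarm.
print(cost_threshold(1, 9))
print(classify(0.2, cost_fp=1, cost_fn=9))  # treated as positive under unequal costs
print(classify(0.2, cost_fp=1, cost_fn=1))  # treated as negative under equal costs
```

With equal costs the threshold is 0.5; as the cost of a false negative grows, the threshold falls and more patients are flagged.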

    A flexible and parallelizable approach to genome-wide polygenic risk scores.

    The heritability of most complex traits is driven by variants throughout the genome. Consequently, polygenic risk scores, which combine information on multiple variants genome-wide, have demonstrated improved accuracy in genetic risk prediction. We present a new two-step approach to constructing genome-wide polygenic risk scores from meta-GWAS summary statistics. Local linkage disequilibrium (LD) is adjusted for in Step 1, followed by, uniquely, long-range LD in Step 2. Our algorithm is highly parallelizable since block-wise analyses in Step 1 can be distributed across a high-performance computing cluster, and flexible, since sparsity and heritability are estimated within each block. Inference is obtained through a formal Bayesian variable selection framework, meaning final risk predictions are averaged over competing models. We compared our method to two alternative approaches, LDpred and lassosum, using all seven traits in the Wellcome Trust Case Control Consortium as well as meta-GWAS summaries for type 1 diabetes (T1D), coronary artery disease, and schizophrenia. Performance was generally similar across methods, although our framework provided more accurate predictions for T1D, for which there are multiple heterogeneous signals in regions of both short- and long-range LD. With sufficient compute resources, our method also allows the fastest runtimes.

    Development and External Validation of Prediction Models for 10-Year Survival of Invasive Breast Cancer. Comparison with PREDICT and CancerMath.

    Purpose: To compare PREDICT and CancerMath, two widely used prognostic models for invasive breast cancer, taking into account their clinical utility. Furthermore, it is unclear whether these models could be improved. Experimental Design: A dataset of 5,729 women was used for model development. A Bayesian variable selection algorithm was implemented to stochastically search for important interaction terms among the predictors. The derived models were then compared in three independent datasets (n = 5,534). We examined calibration, discrimination, and performed decision curve analysis. Results: CancerMath demonstrated worse calibration performance compared with PREDICT in estrogen receptor (ER)-positive and ER-negative tumors. The decline in discrimination performance was -4.27% (-6.39 to -2.03) and -3.21% (-5.9 to -0.48) for ER-positive and ER-negative tumors, respectively. Our new models matched the performance of PREDICT in terms of calibration and discrimination, but offered no improvement. Decision curve analysis showed predictions for all models were clinically useful for treatment decisions made at risk thresholds between 5% and 55% for ER-positive tumors and at thresholds of 15% to 60% for ER-negative tumors. Within these threshold ranges, CancerMath provided the lowest clinical utility among all the models. Conclusions: Survival probabilities from PREDICT offer both improved accuracy and discrimination over CancerMath. Using PREDICT to make treatment decisions offers greater clinical utility than CancerMath over a range of risk thresholds. Our new models performed as well as PREDICT, but no better, suggesting that, in this setting, including further interaction terms offers no predictive benefit. Clin Cancer Res; 24(9); 2110-5. ©2018 AACR
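
Decision curve analysis, as used in this abstract, compares models by their net benefit at a chosen risk threshold. The sketch below uses the usual binary-outcome definition, net benefit = TP/n - (FP/n) × pt/(1 - pt); it is a simplified illustration rather than the study's survival analysis, and the function name and toy data are assumptions.

```python
def net_benefit(y_true, risk, threshold):
    """Net benefit of treating everyone whose predicted risk meets `threshold`.

    y_true:    observed outcomes (1 = event, 0 = no event).
    risk:      predicted event probabilities, same length as y_true.
    threshold: risk threshold pt, strictly between 0 and 1.
    """
    n = len(y_true)
    # True positives: treated patients who had the event.
    tp = sum(1 for y, r in zip(y_true, risk) if r >= threshold and y == 1)
    # False positives: treated patients who did not have the event.
    fp = sum(1 for y, r in zip(y_true, risk) if r >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold.
    return tp / n - (fp / n) * threshold / (1 - threshold)


# Hypothetical toy cohort of four patients.
y_true = [1, 0, 1, 0]
risk = [0.8, 0.6, 0.3, 0.1]
for pt in (0.2, 0.5):
    print(pt, net_benefit(y_true, risk, pt))
```

Plotting net benefit across a range of thresholds for each model, together with the treat-all and treat-none strategies, gives the decision curve from which ranges of clinical usefulness are read.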

    Insight into Genotype-Phenotype Associations through eQTL Mapping in Multiple Cell Types in Health and Immune-Mediated Disease

    Genome-wide association studies (GWAS) have transformed our understanding of the genetics of complex traits such as autoimmune diseases, but how risk variants contribute to pathogenesis remains largely unknown. Identifying genetic variants that affect gene expression (expression quantitative trait loci, or eQTLs) is crucial to addressing this. eQTLs vary between tissues and following in vitro cellular activation, but have not been examined in the context of human inflammatory diseases. We performed eQTL mapping in five primary immune cell types from patients with active inflammatory bowel disease (n = 91), anti-neutrophil cytoplasmic antibody-associated vasculitis (n = 46) and healthy controls (n = 43), revealing eQTLs present only in the context of active inflammatory disease. Moreover, we show that following treatment a proportion of these eQTLs disappear. Through joint analysis of expression data from multiple cell types, we reveal that previous estimates of eQTL immune cell-type specificity are likely to have been exaggerated. Finally, by analysing gene expression data from multiple cell types, we find eQTLs not previously identified by database mining at 34 inflammatory bowel disease-associated loci. In summary, this parallel eQTL analysis in multiple leucocyte subsets from patients with active disease provides new insights into the genetic basis of immune-mediated diseases. This research was funded by a Wellcome Trust Clinical PhD Programme Fellowship (JEP), the NIH-Oxford-Cambridge Scholars Program (ACR), Wellcome Trust Grant 083650/Z/07/Z and MRC Grant MR/L19027/1 (KGCS), and the National Institute for Health Research Cambridge Biomedical Research Centre. KGCS is a National Institute for Health Research Senior Investigator. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

    SLAM++: Simultaneous Localisation and Mapping at the Level of Objects

    We present the major advantages of a new 'object oriented' 3D SLAM paradigm, which takes full advantage in the loop of prior knowledge that many scenes consist of repeated, domain-specific objects and structures. As a hand-held depth camera browses a cluttered scene, real-time 3D object recognition and tracking provides 6DoF camera-object constraints which feed into an explicit graph of objects, continually refined by efficient pose-graph optimisation. This offers the descriptive and predictive power of SLAM systems which perform dense surface reconstruction, but with a huge representation compression. The object graph enables predictions for accurate ICP-based camera to model tracking at each live frame, and efficient active search for new objects in currently undescribed image regions. We demonstrate real-time incremental SLAM in large, cluttered environments, including loop closure, relocalisation and the detection of moved objects, and of course the generation of an object level scene description with the potential to enable interaction.

    Calnexin is necessary for T cell transmigration into the central nervous system

    In multiple sclerosis (MS), a demyelinating inflammatory disease of the CNS, and its animal model (experimental autoimmune encephalomyelitis; EAE), circulating immune cells gain access to the CNS across the blood-brain barrier to cause inflammation, myelin destruction, and neuronal damage. Here, we discovered that calnexin, an ER chaperone, is highly abundant in human brain endothelial cells of MS patients. Conversely, mice lacking calnexin exhibited resistance to EAE induction, no evidence of immune cell infiltration into the CNS, and no induction of inflammation markers within the CNS. Furthermore, calnexin deficiency in mice did not alter the development or function of the immune system. Instead, the loss of calnexin led to a defect in brain endothelial cell function that resulted in reduced T cell trafficking across the blood-brain barrier. These findings identify calnexin in brain endothelial cells as a potentially novel target for developing strategies aimed at managing or preventing the pathogenic cascade that drives neuroinflammation and destruction of the myelin sheath in MS