145 research outputs found

    Smoking Gun or Circumstantial Evidence? Comparison of Statistical Learning Methods using Functional Annotations for Prioritizing Risk Variants

    Although technology has triumphed in facilitating routine genome sequencing, new challenges have been created for the data-analyst. Genome-scale surveys of human variation generate volumes of data that far exceed capabilities for laboratory characterization. By incorporating functional annotations as predictors, statistical learning has been widely investigated for prioritizing genetic variants likely to be associated with complex disease. We compared three published prioritization procedures, which use different statistical learning algorithms and different predictors with regard to the quantity, type and coding. We also explored different combinations of algorithm and annotation set. As an application, we tested which methodology performed best for prioritizing variants using data from a large schizophrenia meta-analysis by the Psychiatric Genomics Consortium. Results suggest that all methods have considerable (and similar) predictive accuracies (AUCs 0.64-0.71) in test set data, but there is more variability in the application to the schizophrenia GWAS. In conclusion, a variety of algorithms and annotations seem to have a similar potential to effectively enrich true risk variants in genome-scale datasets, however none offer more than incremental improvement in prediction. We discuss how methods might be evolved for risk variant prediction to address the impending bottleneck of the new generation of genome re-sequencing studies
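
    The AUC values quoted above (0.64-0.71) can be read as the probability that a randomly chosen true risk variant is scored higher than a randomly chosen non-risk variant. The following is a minimal sketch of that calculation, not code from the paper; the function name `auc` and the toy scores and labels are made up for illustration.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: predicted priority scores (higher = more likely a risk variant)
    labels: 1 for a true risk variant, 0 otherwise
    Ties between a positive and a negative count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): two risk variants, two non-risk variants.
toy_scores = [0.9, 0.8, 0.3, 0.1]
toy_labels = [1, 0, 1, 0]
print(auc(toy_scores, toy_labels))
```

    An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is why values around 0.64-0.71 are described as only an incremental improvement.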

    An additional k-means clustering step improves the biological features of WGCNA gene co-expression networks

    Background: Weighted Gene Co-expression Network Analysis (WGCNA) is a widely used R software package for the generation of gene co-expression networks (GCN). WGCNA generates both a GCN and a derived partitioning of clusters of genes (modules). We propose k-means clustering as an additional processing step to conventional WGCNA, which we have implemented in the R package km2gcn (k-means to gene co-expression network, https://github.com/juanbot/km2gcn). Results: We assessed our method on networks created from UKBEC data (10 different human brain tissues), on networks created from GTEx data (42 human tissues, including 13 brain tissues), and on simulated networks derived from GTEx data. We observed substantially improved module properties, including: (1) few or zero misplaced genes; (2) increased counts of replicable clusters in alternate tissues (×3.1 on average); (3) improved enrichment of Gene Ontology terms (seen in 48/52 GCNs); (4) improved cell type enrichment signals (seen in 21/23 brain GCNs); and (5) more accurate partitions in simulated data according to a range of similarity indices. Conclusions: The results obtained from our investigations indicate that our k-means method, applied as an adjunct to standard WGCNA, results in better network partitions. These improved partitions enable more fruitful downstream analyses, as gene modules are more biologically meaningful
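
    The idea of refining an existing module partition with k-means can be sketched as follows. This is an illustrative sketch only and is not the km2gcn implementation: it assumes each module is summarised by the mean expression profile of its genes (a simple stand-in for a module eigengene) and that genes are reassigned to the module whose summary they correlate with most strongly. The names `pearson`, `centroid`, `refine_modules` and the toy data are all hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def centroid(profiles):
    """Mean profile across genes (assumed stand-in for a module eigengene)."""
    n = len(profiles)
    return [sum(col) / n for col in zip(*profiles)]


def refine_modules(expr, modules, max_iter=50):
    """k-means-style refinement of an initial gene-to-module assignment.

    expr:    dict gene -> list of expression values across samples
    modules: dict gene -> initial module label (e.g. from a WGCNA run)
    """
    assign = dict(modules)
    for _ in range(max_iter):
        labels = sorted(set(assign.values()))
        # Recompute one centroid per currently non-empty module.
        cents = {
            m: centroid([expr[g] for g in assign if assign[g] == m])
            for m in labels
        }
        # Move each gene to the module whose centroid it correlates with best.
        new = {
            g: max(labels, key=lambda m: pearson(expr[g], cents[m]))
            for g in expr
        }
        if new == assign:  # converged: no gene changed module
            break
        assign = new
    return assign


# Toy example (made-up values): gene "e" starts in the wrong module.
expr = {
    "a": [1, 2, 3, 4],
    "b": [2, 4, 6, 8.5],
    "c": [4, 3, 2, 1],
    "d": [8, 6, 4, 2.5],
    "e": [1.1, 2.1, 2.9, 4.2],
}
initial = {"a": "up", "b": "up", "c": "down", "d": "down", "e": "down"}
refined = refine_modules(expr, initial)
print(refined)
```

    In this toy run the misplaced gene "e" moves to the "up" module while the other genes keep their labels, which mirrors the "few or zero misplaced genes" property reported above.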

    Frontotemporal dementia: insights into the biological underpinnings of disease through gene co-expression network analysis

    BACKGROUND: In frontotemporal dementia (FTD) there is a critical lack in the understanding of biological and molecular mechanisms involved in disease pathogenesis. The heterogeneous genetic features associated with FTD suggest that multiple disease-mechanisms are likely to contribute to the development of this neurodegenerative condition. We here present a systems biology approach with the scope of i) shedding light on the biological processes potentially implicated in the pathogenesis of FTD and ii) identifying novel potential risk factors for FTD. We performed a gene co-expression network analysis of microarray expression data from 101 individuals without neurodegenerative diseases to explore regional-specific co-expression patterns in the frontal and temporal cortices for 12 genes (MAPT, GRN, CHMP2B, CTSC, HLA-DRA, TMEM106B, C9orf72, VCP, UBQLN2, OPTN, TARDBP and FUS) associated with FTD and we then carried out gene set enrichment and pathway analyses, and investigated known protein-protein interactors (PPIs) of FTD-genes products. RESULTS: Gene co-expression networks revealed that several FTD-genes (such as MAPT and GRN, CTSC and HLA-DRA, TMEM106B, and C9orf72, VCP, UBQLN2 and OPTN) were clustering in modules of relevance in the frontal and temporal cortices. Functional annotation and pathway analyses of such modules indicated enrichment for: i) DNA metabolism, i.e. transcription regulation, DNA protection and chromatin remodelling (MAPT and GRN modules); ii) immune and lysosomal processes (CTSC and HLA-DRA modules), and; iii) protein meta/catabolism (C9orf72, VCP, UBQLN2 and OPTN, and TMEM106B modules). PPI analysis supported the results of the functional annotation and pathway analyses. 
    CONCLUSIONS: This work further characterizes known FTD genes and elaborates on their biological relevance to disease: we not only indicate the region-specific biological processes likely to be affected, driven by modules containing FTD genes, but also suggest novel potential risk factors among the interactors of FTD genes as targets for further mechanistic characterization in hypothesis-driven cell biology work

    Genome-Scale Methods Converge on Key Mitochondrial Genes for the Survival of Human Cardiomyocytes in Hypoxia

    BACKGROUND: Any reduction in myocardial oxygen delivery relative to its demands can impair cardiac contractile performance. Understanding the mitochondrial metabolic response to hypoxia is key to understanding ischemia tolerance in the myocardium. We used a novel combination of 2 genome-scale methods to study key processes underlying human myocardial hypoxia tolerance. In particular, we hypothesized that computational modeling and evolution would identify similar genes as critical to human myocardial hypoxia tolerance. METHODS AND RESULTS: We analyzed a reconstruction of the cardiac mitochondrial metabolic network using constraint-based methods, under conditions of simulated hypoxia. We used flux balance analysis, random sampling, and principal component analysis to explore feasible steady-state solutions. Hypoxia blunted maximal ATP (−17%) and heme (−75%) synthesis and shrank the feasible solution space. Tricarboxylic acid and urea cycle fluxes were also reduced in hypoxia, but phospholipid synthesis was increased. Using mathematical optimization methods, we identified reactions that would be critical to hypoxia tolerance in the human heart. We used data regarding single-nucleotide polymorphism frequency and distribution in the genomes of Tibetans (whose ancestors have resided in persistent high-altitude hypoxia for several millennia). Six reactions were identified by both methods as being critical to mitochondrial ATP production in hypoxia: phosphofructokinase, phosphoglucokinase, complex II, complex IV, aconitase, and fumarase. CONCLUSIONS: Mathematical optimization and evolution converged on similar genes as critical to human myocardial hypoxia tolerance. Our approach is unique and completely novel and demonstrates that genome-scale modeling and genomics can be used in tandem to provide new insights into cardiovascular genetics

    Bibliometrics of systematic reviews : analysis of citation rates and journal impact factors

    Background: Systematic reviews are important for informing clinical practice and health policy. The aim of this study was to examine the bibliometrics of systematic reviews and to determine the amount of variance in citations predicted by the journal impact factor (JIF) alone and combined with several other characteristics. Methods: We conducted a bibliometric analysis of 1,261 systematic reviews published in 2008 and the citations to them in the Scopus database from 2008 to June 2012. Potential predictors of the citation impact of the reviews were examined using descriptive, univariate and multiple regression analysis. Results: The mean number of citations per review over four years was 26.5 (SD ±29.9) or 6.6 citations per review per year. The mean JIF of the journals in which the reviews were published was 4.3 (SD ±4.2). We found that 17% of the reviews accounted for 50% of the total citations and 1.6% of the reviews were not cited. The number of authors was correlated with the number of citations (r = 0.215, P [...] 5.16) received citations in the bottom quartile (eight or fewer), whereas 9% of reviews published in the lowest JIF quartile (≤2.06) received citations in the top quartile (34 or more). Six percent of reviews in journals with no JIF were also in the first quartile of citations. Conclusions: The JIF predicted over half of the variation in citations to the systematic reviews. However, the distribution of citations was markedly skewed. Some reviews in journals with low JIFs were well-cited and others in higher JIF journals received relatively few citations; hence the JIF did not accurately represent the number of citations to individual systematic reviews

    Iron Age and Anglo-Saxon genomes from East England reveal British migration history

    British population history has been shaped by a series of immigrations, including the early Anglo-Saxon migrations after 400 CE. It remains an open question how these events affected the genetic composition of the current British population. Here, we present whole-genome sequences from 10 individuals excavated close to Cambridge in the East of England, ranging from the late Iron Age to the middle Anglo-Saxon period. By analysing shared rare variants with hundreds of modern samples from Britain and Europe, we estimate that on average the contemporary East English population derives 38% of its ancestry from Anglo-Saxon migrations. We gain further insight with a new method, rarecoal, which infers population history and identifies fine-scale genetic ancestry from rare variants. Using rarecoal we find that the Anglo-Saxon samples are closely related to modern Dutch and Danish populations, while the Iron Age samples share ancestors with multiple Northern European populations including Britain

    Integrated Polygenic Tool Substantially Enhances Coronary Artery Disease Prediction

    BACKGROUND: There is considerable interest in whether genetic data can be used to improve standard cardiovascular disease risk calculators, as the latter are routinely used in clinical practice to manage preventative treatment. METHODS: Using the UK Biobank resource, we developed our own polygenic risk score for coronary artery disease (CAD). We used an additional 60 000 UK Biobank individuals to develop an integrated risk tool (IRT) that combined our polygenic risk score with established risk tools (either the American Heart Association/American College of Cardiology pooled cohort equations [PCE] or UK QRISK3), and we tested our IRT in an additional, independent set of 186 451 UK Biobank individuals. RESULTS: The novel CAD polygenic risk score shows superior predictive power for CAD events, compared with other published polygenic risk scores, and is largely uncorrelated with PCE and QRISK3. When combined with PCE into an IRT, it has superior predictive accuracy. Overall, 10.4% of incident CAD cases were misclassified as low risk by PCE and correctly classified as high risk by the IRT, compared with 4.4% misclassified by the IRT and correctly classified by PCE. The overall net reclassification improvement for the IRT was 5.9% (95% CI, 4.7–7.0). When individuals were stratified into age-by-sex subgroups, the improvement was larger for all subgroups (range, 8.3%–15.4%), with the best performance in 40- to 54-year-old men (15.4% [95% CI, 11.6–19.3]). Comparable results were found using a different risk tool (QRISK3) and also a broader definition of cardiovascular disease. Use of the IRT is estimated to avoid up to 12 000 deaths in the United States over a 5-year period. CONCLUSIONS: An IRT that includes polygenic risk outperforms current risk stratification tools and offers greater opportunity for early interventions. Given the plummeting costs of genetic tests, future iterations of CAD risk tools would be enhanced with the addition of a person’s polygenic risk

    Statistical Power of Model Selection Strategies for Genome-Wide Association Studies

    Genome-wide association studies (GWAS) aim to identify genetic variants related to diseases by examining the associations between phenotypes and hundreds of thousands of genotyped markers. Because many genes are potentially involved in common diseases and a large number of markers are analyzed, it is crucial to devise an effective strategy to identify truly associated variants that have individual and/or interactive effects, while controlling false positives at the desired level. Although a number of model selection methods have been proposed in the literature, including marginal search, exhaustive search, and forward search, their relative performance has only been evaluated through limited simulations due to the lack of an analytical approach to calculating the power of these methods. This article develops a novel statistical approach for power calculation, derives accurate formulas for the power of different model selection strategies, and then uses the formulas to evaluate and compare these strategies in genetic model spaces. In contrast to previous studies, our theoretical framework allows for random genotypes, correlations among test statistics, and a false-positive control based on GWAS practice. After the accuracy of our analytical results is validated through simulations, they are utilized to systematically evaluate and compare the performance of these strategies in a wide class of genetic models. For a specific genetic model, our results clearly reveal how different factors, such as effect size, allele frequency, and interaction, jointly affect the statistical power of each strategy. An example is provided for the application of our approach to empirical research. The statistical approach used in our derivations is general and can be employed to address the model selection problems in other random predictor settings. 
    We have developed an R package, markerSearchPower, to implement our formulas, which can be downloaded from the Comprehensive R Archive Network (CRAN) or http://bioinformatics.med.yale.edu/group/

    Oxford Phase 3 unicompartmental knee arthroplasty: medium-term results of a minimally invasive surgical procedure

    PURPOSE: In the last decade, a major increase in the use of and interest in unicompartmental knee arthroplasty (UKA) has developed. The Oxford Phase 3 UKA is implanted with a minimally invasive technique using newly developed instruments. The objective of this prospective study was to evaluate the outcome of UKA in patients with medial osteoarthritis of the knee in a high-volume unit. METHODS: Two hundred and forty-four UKAs were performed with a minimally invasive approach. The median age was 72 (43-91) years. The median follow-up was 4.2 years (range 1-10.4 years). Fourteen patients died, and nine were considered to be lost to follow-up, but all had a well-functioning prosthesis in situ until their last follow-up. Pain, function and health-related quality of life were evaluated pre- and postoperatively using patient- and assessor-based outcome scores, as well as radiographic evidence. RESULTS: The mean Knee Society knee and function scores, WOMAC scores, Oxford score and VAS pain and satisfaction all improved. Nine knees required revision. Eleven patients required an additional arthroscopic procedure due to persisting pain secondary to intra-articular pathology, and four patients required manipulation under anaesthesia because of limited range of motion. The 7-year cumulative survival rate of the arthroplasty was 94.4%. A low incidence (21%) of a radiolucent line beneath the tibial component was observed at 5 years of follow-up. CONCLUSION: This study showed a high survival rate of the Oxford Phase 3 UKA. Patient satisfaction and functional performance were also very high. The major complication rate was low; in addition, the incidence of radiolucency under the tibial component, when compared to present literature, was low. When strict indication criteria are followed, excellent, durable, and in our opinion reliable, results can be expected for this procedure

    A Systematic Mapping Approach of 16q12.2/FTO and BMI in More Than 20,000 African Americans Narrows in on the Underlying Functional Variation: Results from the Population Architecture using Genomics and Epidemiology (PAGE) Study

    Genetic variants in intron 1 of the fat mass- and obesity-associated (FTO) gene have been consistently associated with body mass index (BMI) in Europeans. However, follow-up studies in African Americans (AA) have shown no support for some of the most consistently BMI-associated FTO index single nucleotide polymorphisms (SNPs). This is most likely explained by different race-specific linkage disequilibrium (LD) patterns and lower correlation overall in AA, which provides the opportunity to fine-map this region and narrow in on the functional variant. To comprehensively explore the 16q12.2/FTO locus and to search for second independent signals in the broader region, we fine-mapped a 646-kb region, encompassing the large FTO gene and the flanking gene RPGRIP1L, by investigating a total of 3,756 variants (1,529 genotyped and 2,227 imputed variants) in 20,488 AAs across five studies. We observed associations between BMI and variants in the known FTO intron 1 locus: the SNP with the most significant p-value, rs56137030 (8.3×10⁻⁶), had not been highlighted in previous studies. While rs56137030 was correlated at r² > 0.5 with 103 SNPs in Europeans (including the GWAS index SNPs), this number was reduced to 28 SNPs in AA. Among rs56137030 and the 28 correlated SNPs, six were located within candidate intronic regulatory elements, including rs1421085, for which we predicted allele-specific binding affinity for the transcription factor CUX1, which has recently been implicated in the regulation of FTO. We did not find strong evidence for a second independent signal in the broader region. In summary, this large fine-mapping study in AA has substantially reduced the number of common alleles that are likely to be functional candidates of the known FTO locus. Importantly, our study demonstrated that comprehensive fine-mapping in AA provides a powerful approach to narrow in on the functional candidate(s) underlying the initial GWAS findings in European populations