45 research outputs found

    Comprehensive analysis of the pseudogenes of glycolytic enzymes in vertebrates: the anomalously high number of GAPDH pseudogenes highlights a recent burst of retrotranspositional activity

    Background: Pseudogenes provide a record of the molecular evolution of genes. As glycolysis is such a highly conserved and fundamental metabolic pathway, the pseudogenes of glycolytic enzymes comprise a standardized genomic measuring stick and an ideal platform for studying molecular evolution. One of the glycolytic enzymes, glyceraldehyde-3-phosphate dehydrogenase (GAPDH), has already been noted to have one of the largest numbers of associated pseudogenes, among all proteins. Results: We assembled the first comprehensive catalog of the processed and duplicated pseudogenes of glycolytic enzymes in many vertebrate model-organism genomes, including human, chimpanzee, mouse, rat, chicken, zebrafish, pufferfish, fruitfly, and worm (available at http://pseudogene.org/glycolysis/). We found that glycolytic pseudogenes are predominantly processed, i.e. retrotransposed from the mRNA of their parent genes. Although each glycolytic enzyme plays a unique role, GAPDH has by far the most pseudogenes, perhaps reflecting its large number of non-glycolytic functions or its possession of a particularly retrotranspositionally active sub-sequence. Furthermore, the number of GAPDH pseudogenes varies significantly among the genomes we studied: none in zebrafish, pufferfish, fruitfly, and worm, 1 in chicken, 50 in chimpanzee, 62 in human, 331 in mouse, and 364 in rat. Next, we developed a simple method of identifying conserved syntenic blocks (consistently applicable to the wide range of organisms in the study) by using orthologous genes as anchors delimiting a conserved block between a pair of genomes. This approach showed that few glycolytic pseudogenes are shared between primate and rodent lineages. Finally, by estimating pseudogene ages using Kimura's two-parameter model of nucleotide substitution, we found evidence for bursts of retrotranspositional activity approximately 42, 36, and 26 million years ago in the human, mouse, and rat lineages, respectively. Conclusion: Overall, we performed a consistent analysis of one group of pseudogenes across multiple genomes, finding evidence that most of them were created within the last 50 million years, subsequent to the divergence of rodent and primate lineages.
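The abstract above names Kimura's two-parameter model but does not describe the authors' pipeline. As a minimal sketch of the standard distance formula only, d = -1/2 ln((1 - 2P - Q) sqrt(1 - 2Q)), where P and Q are the proportions of transition and transversion differences, the following assumes two pre-aligned DNA sequences of equal length; the function name `k2p_distance` and the gap handling are illustrative assumptions, not taken from the paper.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq_a: str, seq_b: str) -> float:
    """Kimura two-parameter distance between two aligned DNA sequences.

    Sites where either sequence has a character other than A, C, G, T
    (for example a gap) are skipped; this is an assumption of the sketch.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    compared = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1  # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared
    q = transversions / compared
    term = (1 - 2 * p - q) * math.sqrt(1 - 2 * q) if 1 - 2 * q > 0 else 0.0
    if term <= 0:
        raise ValueError("sequences too divergent for the K2P formula")
    return -0.5 * math.log(term)


# Ten aligned sites with one transition (A->G) and one transversion (A->C).
example_distance = k2p_distance("AAAAAAAAAA", "GCAAAAAAAA")
```

Converting such a distance into an age in millions of years additionally requires a substitution rate, which the abstract does not state, so that step is left out here.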

    Applying novel tree-based frameworks to big data for classification of heart failure patients and prediction of clinical responses

    Over 5 million Americans suffer from heart failure, a condition whose 5-year survival is poorer than that of every cancer except lung cancer. Conventional understanding of heart failure is simplistic: it is viewed as a single syndrome, despite real heterogeneity. In addition, models predicting outcomes focus on dichotomous results, such as 30-day readmission. A novel approach to classification of heart failure may improve our ability to target interventions, improve patient experiences, and predict outcomes. The Healthcare Cost and Utilization Project is a family of administrative claims databases that describes patient demographics, comorbidities, procedures, acute care utilization and outcomes, such as mortality and readmission. Using the California datasets, which allow linkage of hospital admissions to emergency department visits, we sought to (1) develop a new classification tool for heart failure, (2) predict patient response based on previous visits, and (3) predict survival time. In this pilot study, we propose novel tree-based frameworks for the classification of heart failure patients that can also be used to predict clinical response, health care utilization and mortality. The pilot sample contains 822 patients with heart failure who were randomly selected from a total sample of 211,284 patients. The median number of encounters per patient was 3 (IQR: 5); each encounter is associated with up to 168 variables. By applying random forest approaches to this pilot sample, we have performed classification of patients with heart failure and identified important predictors of outcomes. Going forward, we will refine the model and apply it to the entire data set to produce broadly applicable insights

    Seroprevalence of Strongyloides stercoralis infection in a South Indian adult population

    BACKGROUND: The prevalence of Strongyloides stercoralis infection is estimated to be 30–100 million worldwide, although this is an underestimate. Most cases remain undiagnosed due to the asymptomatic nature of the infection. We wanted to estimate the seroprevalence of S. stercoralis infection in a South Indian adult population. METHODS: To this end, we performed community-based screening of 2351 individuals (aged 18–65) in Kanchipuram District of Tamil Nadu between 2013 and 2020. Serological testing for S. stercoralis was performed using the NIE ELISA. RESULTS: Our data show a seroprevalence of 33% (768/2351) for S. stercoralis infection, with a higher prevalence among males (36%, 386/1069) than among females (29.8%, 382/1282). Adults aged ≥55 (aOR = 1.65, 95% CI: 1.25–2.18) showed higher adjusted odds of association compared with other age groups. Eosinophil levels (39%) (aOR = 1.43, 95% CI: 1.19–1.74) and hemoglobin levels (24%) (aOR = 1.25, 95% CI: 1.11–1.53) were significantly associated with S. stercoralis infection. In contrast, neither low BMI (aOR = 1.15, 95% CI: 0.82–1.61) nor the presence of diabetes mellitus (OR = 1.18, 95% CI: 0.83–1.69) was associated with S. stercoralis seropositivity. CONCLUSIONS: Our study provides evidence for a very high baseline prevalence of S. stercoralis infection in South Indian communities, and this information could support realistic and concrete planning of control measures
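The reported percentages can be recomputed from the counts given in the abstract above. The short sketch below uses only those counts; the variable names are illustrative, and rounding to one decimal place is an assumption about how the figures were reported.

```python
# (seropositive, screened) counts as stated in the abstract.
counts = {
    "overall": (768, 2351),
    "males": (386, 1069),
    "females": (382, 1282),
}


def percent(positive: int, total: int) -> float:
    """Seroprevalence as a percentage, rounded to one decimal place."""
    return round(100 * positive / total, 1)


percentages = {group: percent(*pair) for group, pair in counts.items()}

# The sex-specific counts should add up to the overall counts.
subgroups_consistent = (
    counts["males"][0] + counts["females"][0] == counts["overall"][0]
    and counts["males"][1] + counts["females"][1] == counts["overall"][1]
)
```

With these counts the overall figure comes out at 32.7%, which the abstract rounds to 33%, and the male and female counts sum exactly to the overall totals.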

    Enhanced Transcriptome Maps from Multiple Mouse Tissues Reveal Evolutionary Constraint in Gene Expression for Thousands of Genes

    We characterized by RNA-seq the transcriptional profiles of a large and heterogeneous collection of mouse tissues, augmenting the mouse transcriptome with thousands of novel transcript candidates. Comparison with transcriptome profiles obtained in human cell lines reveals substantial conservation of transcriptional programs, and uncovers a distinct class of genes whose levels of expression across cell types and species have been constrained since early in vertebrate evolution. This core set of genes captures a substantial and constant fraction of the transcriptional output of mammalian cells, and participates in basic functional and structural housekeeping processes common to all cell types. Perturbation of these constrained genes is associated with significant phenotypes including embryonic lethality and cancer. Evolutionary constraint in gene expression levels is not reflected in the conservation of the genomic sequences, but is associated with strong and conserved epigenetic marking, as well as with a characteristic post-transcriptional regulatory program in which sub-cellular localization and alternative splicing play comparatively large roles

    Design and implementation of the international genetics and translational research in transplantation network

    Genome-wide association and Mendelian randomisation analysis provide insights into the pathogenesis of heart failure

    Heart failure (HF) is a leading cause of morbidity and mortality worldwide. A small proportion of HF cases are attributable to monogenic cardiomyopathies and existing genome-wide association studies (GWAS) have yielded only limited insights, leaving the observed heritability of HF largely unexplained. We report results from a GWAS meta-analysis of HF comprising 47,309 cases and 930,014 controls. Twelve independent variants at 11 genomic loci are associated with HF, all of which demonstrate one or more associations with coronary artery disease (CAD), atrial fibrillation, or reduced left ventricular function, suggesting shared genetic aetiology. Functional analysis of non-CAD-associated loci implicates genes involved in cardiac development (MYOZ1, SYNPO2L), protein homoeostasis (BAG3), and cellular senescence (CDKN1A). Mendelian randomisation analysis supports causal roles for several HF risk factors, and demonstrates CAD-independent effects for atrial fibrillation, body mass index, and hypertension. These findings extend our knowledge of the pathways underlying HF and may inform new therapeutic strategies

    Concept and design of a genome-wide association genotyping array tailored for transplantation-specific studies

    Background: In addition to HLA genetic incompatibility, non-HLA differences between transplant donors and recipients that lead to allograft rejection are now becoming evident. We aimed to create a unique genome-wide platform to facilitate transplant-related genomic research. We designed a genome-wide genotyping tool based on the most recent human genomic reference datasets, and included customization for known and potentially relevant metabolic and pharmacological loci relevant to transplantation. Methods: We describe here the design and implementation of a customized genome-wide genotyping array, the ‘TxArray’, comprising approximately 782,000 markers with tailored content for deeper capture of variants across HLA, KIR, pharmacogenomic, and metabolic loci important in transplantation. To test concordance and genotyping quality, we genotyped 85 HapMap samples on the array, including eight trios. Results: We show low Mendelian error rates and high concordance rates for HapMap samples (average parent-parent-child heritability of 0.997, and concordance of 0.996). We performed genotype imputation across autosomal regions, masking directly genotyped SNPs to assess imputation accuracy, and report an accuracy of >0.962 for directly genotyped SNPs. We demonstrate much higher capture of the natural killer cell immunoglobulin-like receptor (KIR) region versus comparable platforms. Overall, we show that the genotyping quality and coverage of the TxArray is very high when compared to reference samples and to other genome-wide genotyping platforms. Conclusions: We have designed a comprehensive genome-wide genotyping tool which enables accurate association testing and imputation of ungenotyped SNPs, facilitating powerful and cost-effective large-scale genotyping for transplant-related studies.
    Electronic supplementary material: The online version of this article (doi:10.1186/s13073-015-0211-x) contains supplementary material, which is available to authorized users