70 research outputs found

    Quantifying the mutational process

    Survival prediction in mesothelioma using a scalable lasso regression model: instructions for use and initial performance using clinical predictors

    Introduction: Accurate prognostication is difficult in malignant pleural mesothelioma (MPM). We developed a set of robust computational models to quantify the prognostic value of routinely available clinical data, which form the basis of published MPM prognostic models. Methods: Data regarding 269 patients with MPM were allocated to balanced training (n=169) and validation sets (n=100). Prognostic signatures (minimal-length, best-performing multivariate trained models) were generated by least absolute shrinkage and selection operator regression for overall survival (OS), OS <6 months and OS <12 months. OS prediction was quantified using Somers' DXY statistic, which varies from 0 to 1, with increasing concordance between observed and predicted outcomes. 6-month survival and 12-month survival were described by area under the curve (AUC) scores. Results: Median OS was 270 (IQR 140–450) days. The primary OS model assigned high weights to four predictors: age, performance status, white cell count and serum albumin, and after cross-validation performed significantly better than would be expected by chance (mean DXY 0.332 (±0.019)). However, validation set DXY was only 0.221 (0.0935–0.346), equating to a 22% improvement in survival prediction over what would be expected by chance. The 6-month and 12-month OS signatures included the same four predictors, in addition to epithelioid histology plus platelets and epithelioid histology plus C-reactive protein (mean AUC 0.758 (±0.022) and 0.737 (±0.012), respectively). The <6-month OS model demonstrated 74% sensitivity and 68% specificity. The <12-month OS model demonstrated 63% sensitivity and 79% specificity. Model content and performance were generally comparable with previous studies. Conclusions: The basic clinical information contained in these, and previously published, models is fundamentally of limited value in accurately predicting MPM prognosis. The methods described are suitable for expansion using emerging predictors, including tumour genomics and volumetric staging.
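
As a rough illustration of the DXY statistic reported in the abstract above, the following sketch computes Somers' DXY from Harrell's concordance index C using the standard relation DXY = 2C − 1. It is not the authors' code: the function name, the handling of ties, and the toy inputs are assumptions made for illustration. Under this relation, a DXY of 0.221 corresponds to a C index of about 0.61, and a value of 0 corresponds to chance-level ranking.

```python
from itertools import combinations


def somers_dxy(times, events, risk_scores):
    """Somers' Dxy for right-censored survival data (illustrative sketch).

    times       -- observed follow-up times
    events      -- 1/True if death was observed, 0/False if censored
    risk_scores -- model output; a higher score should mean shorter survival
    """
    concordant = 0.0
    usable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` is the subject with the shorter time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is usable only if the times differ and the shorter
        # time is an observed event (not a censoring time).
        if times[a] == times[b] or not events[a]:
            continue
        usable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # model ranks the pair correctly
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied predictions count as half (assumed convention)
    if usable == 0:
        raise ValueError("no usable pairs")
    c_index = concordant / usable
    return 2.0 * c_index - 1.0


# Hypothetical toy data: risk scores perfectly ordered against survival time.
print(somers_dxy([100, 200, 300, 400], [1, 1, 0, 1], [0.9, 0.7, 0.5, 0.1]))
```

Computed this way, the statistic can also fall below 0 when a model ranks patients worse than chance, so the 0 to 1 range quoted in the abstract covers models that perform at or above chance level.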

    Patterns of intron sequence evolution in Drosophila are dependent upon length and GC content

    BACKGROUND: Introns comprise a large fraction of eukaryotic genomes, yet little is known about their functional significance. Regulatory elements have been mapped to some introns, though these are believed to account for only a small fraction of genome-wide intronic DNA. No consistent patterns have emerged from studies that have investigated general levels of evolutionary constraint in introns. RESULTS: We examine the relationship between intron length and levels of evolutionary constraint by analyzing inter-specific divergence at 225 intron fragments in Drosophila melanogaster and Drosophila simulans, sampled from a broad distribution of intron lengths. We document a strongly negative correlation between intron length and divergence. Interestingly, we also find that divergence in introns is negatively correlated with GC content. This relationship does not account for the correlation between intron length and divergence, however, and may simply reflect local variation in mutational rates or biases. CONCLUSION: Short introns make up only a small fraction of total intronic DNA in the genome. Our finding that long introns evolve more slowly than average implies that, while the majority of introns in the Drosophila genome may experience little or no selective constraint, most intronic DNA in the genome is likely to be evolving under considerable constraint. Our results suggest that functional elements may be ubiquitous within longer introns and that these introns may have a more general role in regulating gene expression than previously appreciated. Our finding that GC content and divergence are negatively correlated in introns has important implications for the interpretation of the correlation between divergence and levels of codon bias observed in Drosophila.

    Reduced efficacy of selection in regions of the Drosophila genome that lack crossing over

    BACKGROUND: The recombinational environment is predicted to influence patterns of protein sequence evolution through the effects of Hill-Robertson interference among linked sites subject to selection. In freely recombining regions of the genome, selection should more effectively incorporate new beneficial mutations, and eliminate deleterious ones, than in regions with low rates of genetic recombination. RESULTS: We examined the effects of recombinational environment on patterns of evolution using a genome-wide comparison of Drosophila melanogaster and D. yakuba. In regions of the genome with no crossing over, we find elevated divergence at nonsynonymous sites and in long introns, a virtual absence of codon usage bias, and an increase in gene length. However, we find little evidence for differences in patterns of evolution between regions with high, intermediate, and low crossover frequencies. In addition, genes on the fourth chromosome exhibit more extreme deviations from regions with crossing over than do other no-crossover genes outside the fourth chromosome. CONCLUSION: All of the patterns observed are consistent with a severe reduction in the efficacy of selection in the absence of crossing over, resulting in the accumulation of deleterious mutations in these regions. Our results also suggest that even a very low frequency of crossing over may be enough to maintain the efficacy of selection.

    A multi-tissue age prediction model based on DNA methylation analysis

    Age-related, tissue-specific DNA methylation markers have been identified in many studies and can be used to estimate the chronological age of the donor of an unknown biological sample. However, if these markers are applied to the wrong type of tissue, they give an inaccurate age estimate. This research therefore examined HumanMethylation450 (HM450) BeadChip-based profiles retrieved from the NCBI repository, with the aim of identifying a set of universal DNA methylation markers across forensically relevant tissues. Using elastic net regression, it was possible to identify 10 age-related (AR) DNA methylation markers across 41 samples from five types of tissue (whole blood, saliva, semen, menstrual blood, and vaginal secretions). The average prediction error of the constructed model on the training data is 3.8 years. In an independent dataset of 24 samples from four types of tissue (blood, saliva, menstrual blood, and vaginal secretions), the mean absolute deviation is 6.9 years for menstrual blood and vaginal fluid, 5.6 years for buccal swabs, and 7.8 years for blood. The overall multi-tissue prediction error based on bootstrap analysis was 7.8 years (95% confidence interval 6–9.7 years). The identified multi-tissue age prediction model has the potential to assist forensic investigations without the requirement to identify the sample body fluid type.
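
The per-tissue figures in the abstract above are mean absolute deviations. A minimal sketch of how such a figure could be computed is given below, assuming that "mean absolute deviation" means the mean absolute difference between predicted and chronological age. The function names and sample values are hypothetical and are not taken from the study.

```python
from collections import defaultdict


def mean_absolute_deviation(true_ages, predicted_ages):
    """Mean of |chronological age - predicted age| over all samples."""
    if len(true_ages) != len(predicted_ages) or not true_ages:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(t - p) for t, p in zip(true_ages, predicted_ages))
    return total / len(true_ages)


def mad_by_tissue(samples):
    """Group (tissue, true_age, predicted_age) records and report MAD per tissue."""
    grouped = defaultdict(lambda: ([], []))
    for tissue, true_age, predicted_age in samples:
        grouped[tissue][0].append(true_age)
        grouped[tissue][1].append(predicted_age)
    return {
        tissue: mean_absolute_deviation(true_ages, predicted_ages)
        for tissue, (true_ages, predicted_ages) in grouped.items()
    }


# Hypothetical records: (tissue, chronological age, model-predicted age).
example = [("blood", 30, 36), ("blood", 50, 40), ("saliva", 25, 22)]
print(mad_by_tissue(example))
```

Reporting the error per tissue as well as overall makes it visible when a multi-tissue model performs unevenly across sample types.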

    Evidence for Pervasive Adaptive Protein Evolution in Wild Mice

    The relative contributions of neutral and adaptive substitutions to molecular evolution have been one of the most controversial issues in evolutionary biology for more than 40 years. The analysis of within-species nucleotide polymorphism and between-species divergence data supports a widespread role for adaptive protein evolution in certain taxa. For example, estimates of the proportion of adaptive amino acid substitutions (alpha) are 50% or more in enteric bacteria and Drosophila. In contrast, recent estimates of alpha for hominids have been at most 13%. Here, we estimate alpha for protein sequences of murid rodents based on nucleotide polymorphism data from multiple genes in a population of the house mouse subspecies Mus musculus castaneus, which inhabits the ancestral range of the Mus species complex, and on nucleotide divergence between M. m. castaneus and M. famulus or the rat. We estimate that 57% of amino acid substitutions in murids have been driven by positive selection. Hominids, therefore, are exceptional in having low apparent levels of adaptive protein evolution. The high frequency of adaptive amino acid substitutions in wild mice is consistent with their large effective population size, leading to effective natural selection at the molecular level. Effective natural selection also manifests itself as a paucity of effectively neutral nonsynonymous mutations in M. m. castaneus compared to humans.
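
To make the quantity alpha concrete, the sketch below uses the simple McDonald-Kreitman-based estimator, alpha = 1 − (Ds × Pn) / (Dn × Ps), which combines counts of nonsynonymous and synonymous divergence (Dn, Ds) and polymorphism (Pn, Ps). The abstract does not say which estimator was used, so this is an assumed, simplified stand-in, and the counts are invented for illustration.

```python
def mk_alpha(dn, ds, pn, ps):
    """Simple McDonald-Kreitman-based estimate of alpha.

    dn, ds -- nonsynonymous / synonymous fixed differences between species
    pn, ps -- nonsynonymous / synonymous polymorphisms within a species
    """
    if dn <= 0 or ps <= 0:
        raise ValueError("dn and ps must be positive")
    # Under strict neutrality Dn/Ds is expected to equal Pn/Ps, so the
    # ratio below is 1 and alpha is 0; an excess of nonsynonymous
    # divergence pushes alpha towards 1.
    return 1.0 - (ds * pn) / (dn * ps)


# Hypothetical counts, chosen only to illustrate the arithmetic.
print(mk_alpha(dn=100, ds=200, pn=20, ps=100))  # excess divergence, alpha > 0
print(mk_alpha(dn=50, ds=200, pn=40, ps=100))   # excess polymorphism, alpha < 0
```

This simple form can return negative values when nonsynonymous polymorphism is in excess, for example when mildly deleterious mutations are segregating, which is one reason more elaborate estimators are often preferred.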

    Recent evolution in Rattus norvegicus is shaped by declining effective population size

    The brown rat, Rattus norvegicus, is both a notorious pest and a frequently used model in biomedical research. By analyzing genome sequences of 12 wild-caught brown rats from their presumed ancestral range in NE China, along with the sequence of a black rat, Rattus rattus, we investigate the selective and demographic forces shaping variation in the genome. We estimate that the recent effective population size (N(e)) of this species = [Formula: see text], based on silent site diversity. We compare patterns of diversity in these genomes with patterns in multiple genome sequences of the house mouse (Mus musculus castaneus), which has a much larger N(e). This reveals an important role for variation in the strength of genetic drift in mammalian genome evolution. By a Pairwise Sequentially Markovian Coalescent analysis of demographic history, we infer that there has been a recent population size bottleneck in wild rats, which we date to approximately 20,000 years ago. Consistent with this, wild rat populations have experienced an increased flux of mildly deleterious mutations, which segregate at higher frequencies in protein-coding genes and conserved noncoding elements. This leads to negative estimates of the rate of adaptive evolution (α) in proteins and conserved noncoding elements, a result which we discuss in relation to the strongly positive estimates observed in wild house mice. As a consequence of the population bottleneck, wild rats also show a markedly slower decay of linkage disequilibrium with physical distance than wild house mice.

    Assessing Recent Selection and Functionality at Long Non-Coding RNA Loci in the Mouse Genome

    This work was supported by the Biotechnology and Biological Sciences Research Council and The Wellcome Trust. A.N. was supported by the Swiss National Science Foundation (Grant: PZ00P3_142636). H.K. was supported by the European Research Council (Starting Grant: 242597, SexGenTransEvolution) and the Swiss National Science Foundation (Grants: 130287 and 146474).
    Long noncoding RNAs (lncRNAs) are one of the most intensively studied groups of noncoding elements. Debate continues over what proportion of lncRNAs are functional and what proportion merely represent transcriptional noise. Although characterization of individual lncRNAs has identified approximately 200 functional loci across the Eukarya, general surveys have found only modest or no evidence of long-term evolutionary conservation. Although this lack of conservation suggests that most lncRNAs are nonfunctional, the possibility remains that some represent recent evolutionary innovations. We examine recent selection pressures acting on lncRNAs in mouse populations. We compare patterns of within-species nucleotide variation at approximately 10,000 lncRNA loci in a cohort of the wild house mouse, Mus musculus castaneus, with between-species nucleotide divergence from the rat (Rattus norvegicus). Loci under selective constraint are expected to show reduced nucleotide diversity and divergence. We find limited evidence of sequence conservation compared with putatively neutrally evolving ancestral repeats (ARs). Comparisons of sequence diversity and divergence between ARs, protein-coding (PC) exons and lncRNAs, and the associated flanking regions, show weakly but significantly lower levels of sequence diversity and divergence at lncRNAs compared with ARs. lncRNAs conserved deep in the vertebrate phylogeny show lower within-species sequence diversity than lncRNAs in general. A set of 74 functionally characterized lncRNAs show levels of diversity and divergence comparable to PC exons, suggesting that these lncRNAs are under substantial selective constraints. Our results suggest that, in mouse populations, most lncRNA loci evolve at rates similar to ARs, whereas older lncRNAs tend to show signals of selection similar to PC genes.
    Postprint. Peer reviewed.

    Chemical Proteomics-Based Analysis of Off-target Binding Profiles for Rosiglitazone and Pioglitazone: Clues for Assessing Potential for Cardiotoxicity

    Drugs exert desired and undesired effects based on their binding interactions with protein target(s) and off-target(s), providing evidence for drug efficacy and toxicity. Pioglitazone and rosiglitazone possess a common functional core, glitazone, which is considered a privileged scaffold upon which to build a drug selective for a given target, in this case PPARγ. Herein, we report a retrospective analysis of two variants of the glitazone scaffold, pioglitazone and rosiglitazone, in an effort to identify off-target binding events in the rat heart to explain recently reported cardiovascular risk associated with these drugs. Our results suggest that glitazone has affinity for dehydrogenases, consistent with known binding preferences for related rhodanine cores. Both drugs bound ion channels and modulators, with implications for congestive heart failure, arrhythmia, and peripheral edema. Additional proteins involved in glucose homeostasis, synaptic transduction, and mitochondrial energy production were detected and potentially contribute to drug efficacy and cardiotoxicity.