4 research outputs found

    SNPredict: A Machine Learning Approach for Detecting Low Frequency Variants in Cancer

    Cancer is a genetic disease caused by the accumulation of DNA variants such as single nucleotide changes or insertions/deletions in DNA. DNA variants can silence tumor suppressor genes or increase the activity of oncogenes. To develop successful therapies for cancer patients, these DNA variants need to be identified accurately. DNA variants can be identified by comparing the DNA sequence of tumor tissue to that of non-tumor tissue using Next Generation Sequencing (NGS) technology. However, detecting variants in cancer is hard because many of these variants occur only in a small subpopulation of the tumor tissue. It becomes a challenge to distinguish these low frequency variants from sequencing errors, which are common in today's NGS methods. Several algorithms have been developed and implemented as tools to identify such variants in cancer. However, it has been previously shown that there is low concordance in the results produced by these tools. Moreover, the number of false positives tends to increase significantly when these tools are faced with low frequency variants. This study presents SNPredict, a single nucleotide polymorphism (SNP) detection pipeline that uses the results of multiple variant callers, with the help of machine learning techniques, to produce a consensus output with higher accuracy than any individual tool. By extracting features from the consensus output that describe traits associated with an individual variant call, it creates binary classifiers that predict a SNP’s true state and therefore help in distinguishing a sequencing error from a true variant.
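The consensus step described in this abstract can be illustrated with a minimal sketch. It is not SNPredict's actual implementation: the function name `build_consensus`, the caller names, the input tuple layout, and the chosen features (number of supporting callers, mean variant allele frequency, per-caller flags) are all assumptions made for illustration. The resulting rows are the kind of per-variant feature vectors a binary classifier could be trained on.

```python
from collections import defaultdict


def build_consensus(calls_by_caller):
    """Merge calls from several variant callers into per-variant feature rows.

    calls_by_caller maps a (hypothetical) caller name to an iterable of
    (chrom, pos, ref, alt, vaf) tuples, where vaf is the variant allele
    frequency reported by that caller.
    """
    callers = sorted(calls_by_caller)
    support = defaultdict(dict)  # variant key -> {caller: vaf}
    for caller, calls in calls_by_caller.items():
        for chrom, pos, ref, alt, vaf in calls:
            support[(chrom, pos, ref, alt)][caller] = vaf

    rows = []
    for key in sorted(support):
        vafs = support[key]
        rows.append({
            "variant": key,
            "n_support": len(vafs),                        # callers reporting it
            "frac_support": len(vafs) / len(callers),      # concordance fraction
            "mean_vaf": sum(vafs.values()) / len(vafs),    # average reported VAF
            "flags": [int(c in vafs) for c in callers],    # one 0/1 flag per caller
        })
    return rows


# Illustrative input only; these calls are invented, not taken from the study.
example_calls = {
    "callerA": [("chr1", 100, "A", "T", 0.05)],
    "callerB": [("chr1", 100, "A", "T", 0.07), ("chr2", 5, "G", "C", 0.50)],
}
example_rows = build_consensus(example_calls)
```

A variant reported by only one caller at a low allele frequency would end up with a low `frac_support` and a low `mean_vaf`, which is the sort of signal a downstream classifier might use to separate sequencing errors from true variants.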

    Extensive de novo mutation rate variation between individuals and across the genome of Chlamydomonas reinhardtii

    Describing the process of spontaneous mutation is fundamental for understanding the genetic basis of disease, the threat posed by declining population size in conservation biology, and much of evolutionary biology. Directly studying spontaneous mutation has been difficult, however, because new mutations are rare. Mutation accumulation (MA) experiments overcome this by allowing mutations to build up over many generations in the near absence of natural selection. Here, we sequenced the genomes of 85 MA lines derived from six genetically diverse strains of the green alga Chlamydomonas reinhardtii. We identified 6843 new mutations, more than any other study of spontaneous mutation. We observed sevenfold variation in the mutation rate among strains and that mutator genotypes arose, increasing the mutation rate approximately eightfold in some replicates. We also found evidence for fine-scale heterogeneity in the mutation rate, with certain sequence motifs mutating at much higher rates, and clusters of multiple mutations occurring at closely linked sites. There was little evidence, however, for mutation rate heterogeneity between chromosomes or over large genomic regions of 200 kbp. We generated a predictive model of the mutability of sites based on their genomic properties, including local GC content, gene expression level, and local sequence context. Our model accurately predicted the average mutation rate and natural levels of genetic diversity of sites across the genome. Notably, trinucleotides vary 17-fold in rate between the most and least mutable sites. Our results uncover a rich heterogeneity in the process of spontaneous mutation both among individuals and across the genome.
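As a rough illustration of how a per-site, per-generation mutation rate is commonly estimated from an MA line, and how a fold difference between strains follows from it, here is a minimal sketch. The estimator (mutations divided by callable sites times generations) is the standard textbook form and may differ from the one the study used; the function names and all numbers in the usage lines are hypothetical and are not taken from the abstract.

```python
def mutation_rate(n_mutations, callable_sites, generations):
    """Estimate mutations per site per generation for one MA line."""
    if callable_sites <= 0 or generations <= 0:
        raise ValueError("callable_sites and generations must be positive")
    return n_mutations / (callable_sites * generations)


def fold_variation(rates):
    """Ratio of the highest to the lowest rate in a {name: rate} mapping."""
    values = list(rates.values())
    return max(values) / min(values)


# Hypothetical inputs for illustration only.
strain_rates = {
    "strain_low": mutation_rate(10, 100_000_000, 1000),
    "strain_high": mutation_rate(70, 100_000_000, 1000),
}
strain_fold = fold_variation(strain_rates)
```

The same `fold_variation` helper could be applied to rates grouped by trinucleotide context instead of by strain, given per-context mutation counts and site counts.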

    Epigenetic and Genetic Contributions to Adaptation in Chlamydomonas.

    Epigenetic modifications, such as DNA methylation or histone modifications, can be transmitted between cellular or organismal generations. However, there are no experiments measuring their role in adaptation, so here we use experimental evolution to investigate how epigenetic variation can contribute to adaptation. We manipulated DNA methylation and histone acetylation in the unicellular green alga Chlamydomonas reinhardtii both genetically and chemically to change the amount of epigenetic variation generated or transmitted in adapting populations in three different environments (salt stress, phosphate starvation, and high CO2) for two hundred asexual generations. We find that reducing the amount of epigenetic variation available to populations can reduce adaptation in environments where it otherwise happens. From genomic and epigenomic sequences from a subset of the populations, we see changes in methylation patterns between the evolved populations over-represented in some functional categories of genes, which is consistent with some of these differences being adaptive. Based on whole genome sequencing of evolved clones, the majority of DNA methylation changes do not appear to be linked to cis-acting genetic mutations. Our results show that transgenerational epigenetic effects play a role in adaptive evolution, and suggest that the relationship between changes in methylation patterns and differences in evolutionary outcomes, at least for quantitative traits such as cell division rates, is complex.