10 research outputs found

    forqs: Forward-in-time Simulation of Recombination, Quantitative Traits, and Selection

    forqs is a forward-in-time simulation of recombination, quantitative traits, and selection. It was designed to investigate haplotype patterns resulting from scenarios where substantial evolutionary change has taken place in a small number of generations due to recombination and/or selection on polygenic quantitative traits. forqs is implemented as a command-line C++ program. Source code and binary executables for Linux, OSX, and Windows are freely available under a permissive BSD license. Comment: preprint includes Supplementary Information. https://bitbucket.org/dkessner/forq

    Genet Med

    Recent dramatic advances in multiomics research coupled with exponentially increasing volume, complexity, and interdisciplinary nature of publications are making it challenging for scientists to stay up-to-date on the literature. Strategies to address this challenge include the creation of online databases and warehouses to support timely and targeted dissemination of research findings. Although most of the early examples have been in cancer genomics and pharmacogenomics, the approaches used can be adapted to support investigators in heart, lung, blood, and sleep (HLBS) disorders research. In this article, we describe the creation of an HLBS population genomics (HLBS-PopOmics) knowledge base as an online, continuously updated, searchable database to support the dissemination and implementation of studies and resources that are relevant to clinical and public health practice. In addition to targeted searches based on the HLBS disease categories, cross-cutting themes reflecting the ethical, legal, and social implications of genomics research; systematic evidence reviews; and clinical practice guidelines supporting screening, detection, evaluation, and treatment are also emphasized in HLBS-PopOmics. Future updates of the knowledge base will include additional emphasis on transcriptomics, proteomics, metabolomics, and other omics research; explore opportunities for leveraging data sets designed to support scientific discovery; and incorporate advanced machine learning bioinformatics capabilities. Funding: CC999999/Intramural CDC HHS/United States; Z99 CA999999/Intramural NIH HHS/United States. Date: 2019-09-01. PMID: 30197419. PMCID: PMC6402952.

    EXPLORING POPULATION CHANGE DETECTION BY MONITORING EFFECTIVE NUMBER OF BREEDERS

    Detecting whether a population is in decline is an important objective for biologists and conservationists who are monitoring threatened populations. As genetic methods improve, effective population size (Ne) and effective number of breeders (Nb) continue to gain popularity as a way to monitor species. Using simulated populations and linkage disequilibrium, we explored detecting population decline through Nb in age-structured populations. Through comparisons of sensitivity (1 - false negatives) and specificity (1 - false positives) over 1000 replicates, we explored how factors such as starting Nb, number of SNPs, number of individuals sampled, number of breeding cycles monitored, and rate of decline affected the ability to detect changes in the population. Overall, we found Nb can be an effective metric for detecting population declines, if some care is taken during study design to avoid certain conditions. Although specificity did not vary greatly, sensitivity was much more reactive to changes in the factors tested. Under-sampling of the population (< true Nb), an insufficient number of breeding cycles monitored (< 7 cycles), and low levels of decline (e.g. < 7%) are all detrimental to detection of population change.
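
The sensitivity and specificity measures named in this abstract can be illustrated with a minimal sketch. It assumes each simulated replicate carries a known truth label (declined or stable) and a detection call; the function name and the boolean encoding are hypothetical and not taken from the study.

```python
def sensitivity_specificity(truth, calls):
    """Return (sensitivity, specificity) over paired replicate outcomes.

    truth[i] is True if replicate i truly declined (assumed known from the
    simulation); calls[i] is True if a decline was detected in replicate i.
    """
    pairs = list(zip(truth, calls))
    tp = sum(1 for t, c in pairs if t and c)          # declines detected
    fn = sum(1 for t, c in pairs if t and not c)      # declines missed
    tn = sum(1 for t, c in pairs if not t and not c)  # stable, no alarm
    fp = sum(1 for t, c in pairs if not t and c)      # stable, false alarm
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one declining and one stable replicate")
    sensitivity = tp / (tp + fn)  # equals 1 - false negative rate
    specificity = tn / (tn + fp)  # equals 1 - false positive rate
    return sensitivity, specificity
```

For example, with four declining and four stable replicates, one missed decline and one false alarm give a sensitivity and a specificity of 0.75 each.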

    Boosting forward-time population genetic simulators through genotype compression

    Background: Forward-time population genetic simulations play a central role in deriving and testing evolutionary hypotheses. Such simulations may be data-intensive, depending on the settings to the various parameters controlling them. In particular, for certain settings, the data footprint may quickly exceed the memory of a single compute node. Results: We develop a novel and general method for addressing the memory issue inherent in forward-time simulations by compressing and decompressing, in real-time, active and ancestral genotypes, while carefully accounting for the time overhead. We propose a general graph data structure for compressing the genotype space explored during a simulation run, along with efficient algorithms for constructing and updating compressed genotypes which support both mutation and recombination. We tested the performance of our method in very large-scale simulations. Results show that our method not only scales well, but that it also overcomes memory issues that would cripple existing tools. Conclusions: As evolutionary analyses are being increasingly performed on genomes, pathways, and networks, particularly in the era of systems biology, scaling population genetic simulators to handle large-scale simulations is crucial. We believe our method offers a significant step in that direction. Further, the techniques we provide are generic and can be integrated with existing population genetic simulators to boost their performance in terms of memory usage.

    Adaptive Savitzky–Golay Filters for Analysis of Copy Number Variation Peaks from Whole-Exome Sequencing Data

    Copy number variation (CNV) is a form of structural variation in the human genome that provides medical insight into complex human diseases; while whole-genome sequencing is becoming more affordable, whole-exome sequencing (WES) remains an important tool in clinical diagnostics. Because of its discontinuous nature and unique characteristics of sparse target-enrichment-based WES data, the analysis and detection of CNV peaks remain difficult tasks. The Savitzky–Golay (SG) smoothing is well known as a fast and efficient smoothing method. However, no study has documented the use of this technique for CNV peak detection. It is well known that the effectiveness of the classical SG filter depends on the proper selection of the window length and polynomial degree, which should correspond with the scale of the peak because, in the case of peaks with a high rate of change, the effectiveness of the filter could be restricted. Based on the Savitzky–Golay algorithm, this paper introduces a novel adaptive method to smooth irregular peak distributions. The proposed method ensures high-precision noise reduction by dynamically modifying the results of the prior smoothing to automatically adjust parameters. Our method offers an additional feature extraction technique based on density and Euclidean distance. In comparison to classical Savitzky–Golay filtering and other peer filtering methods, the performance evaluation demonstrates that adaptive Savitzky–Golay filtering performs better. According to experimental results, our method effectively detects CNV peaks across all genomic segments for both short and long tags, with minimal peak height fidelity values (i.e., low estimation bias). As a result, we clearly demonstrate how well the adaptive Savitzky–Golay filtering method works and how its use in the detection of CNV peaks can complement the existing techniques used in CNV peak analysis
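
As background for the classical filter this abstract builds on, here is a minimal sketch of fixed-window Savitzky–Golay smoothing using the standard 5-point quadratic weights (-3, 12, 17, 12, -3)/35. It is not the adaptive method proposed in the paper; the function name and the choice to leave the two edge points on each side unsmoothed are assumptions made for illustration.

```python
# Classical 5-point quadratic Savitzky-Golay smoothing weights (they sum to 35).
SG5_WEIGHTS = (-3, 12, 17, 12, -3)
SG5_NORM = 35


def savgol5(values):
    """Smooth a sequence with the fixed 5-point quadratic SG filter.

    Interior points are replaced by the weighted average of their 5-point
    window; the two points at each edge are copied unchanged (an assumption
    of this sketch, not part of the paper's method).
    """
    values = list(values)
    smoothed = [float(v) for v in values]
    for i in range(2, len(values) - 2):
        window = values[i - 2 : i + 3]
        smoothed[i] = sum(w * v for w, v in zip(SG5_WEIGHTS, window)) / SG5_NORM
    return smoothed
```

With this fixed window, an isolated spike of height 1 is reduced to 17/35 at its center, while data that follow a quadratic trend pass through unchanged. This is the dependence on window length and polynomial degree that the abstract describes.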

    OPENMENDEL: A Cooperative Programming Project for Statistical Genetics

    Statistical methods for genomewide association studies (GWAS) continue to improve. However, the increasing volume and variety of genetic and genomic data make computational speed and ease of data manipulation mandatory in future software. In our view, a collaborative effort of statistical geneticists is required to develop open source software targeted to genetic epidemiology. Our attempt to meet this need is called the OPENMENDEL project (https://openmendel.github.io). It aims to (1) enable interactive and reproducible analyses with informative intermediate results, (2) scale to big data analytics, (3) embrace parallel and distributed computing, (4) adapt to rapid hardware evolution, (5) allow cloud computing, (6) allow integration of varied genetic data types, and (7) foster easy communication between clinicians, geneticists, statisticians, and computer scientists. This article reviews and makes recommendations to the genetic epidemiology community in the context of the OPENMENDEL project. Comment: 16 pages, 2 figures, 2 tables

    An evolutionary epigenetic clock in plants

    Molecular clocks are the basis for dating the divergence between lineages over macroevolutionary timescales (~10^5 to 10^8 years). However, classical DNA-based clocks tick too slowly to inform us about the recent past. Here, we demonstrate that stochastic DNA methylation changes at a subset of cytosines in plant genomes display a clocklike behavior. This “epimutation clock” is orders of magnitude faster than DNA-based clocks and enables phylogenetic explorations on a scale of years to centuries. We show experimentally that epimutation clocks recapitulate known topologies and branching times of intraspecies phylogenetic trees in the self-fertilizing plant Arabidopsis thaliana and the clonal seagrass Zostera marina, which represent two major modes of plant reproduction. This discovery will open new possibilities for high-resolution temporal studies of plant biodiversity.

    Discovering Higher-order SNP Interactions in High-dimensional Genomic Data

    In this thesis, a multifactor dimensionality reduction-based method using associative classification is employed to identify higher-order SNP interactions, with the aim of improving the understanding of the genetic architecture of complex diseases. Further, this thesis explores the application of deep learning techniques to provide new clues for interaction analysis. The performance of the deep learning method is maximized by unifying deep neural networks with a random forest to identify reliable interactions in the presence of noise.

    Selection and Metabolic Disease in the Pacific

    An example of a "signature of selection" in a population is a region of the genome that exhibits a reduction in genetic variability with a particular linkage disequilibrium pattern. This reduction in variation can arise when the phenotype of a neutral beneficial allele experiences a favourable change in environmental conditions. This results in an increased frequency of both the allele, and linked sites, within a population. Polynesian populations share a common genetic ancestry with East Asia, but little characterisation of genetic selection has been undertaken in Polynesian populations. Serum urate has been associated with metabolic disorders such as obesity, type 2 diabetes, renal disease and metabolic syndrome. It is hypothesised that serum urate may have undergone positive selection in Polynesians due to some of the beneficial properties, such as its role as an anti-oxidant, or as an adjuvant for the innate immune system. New Zealand Polynesians have inherently elevated serum urate levels and increased rates of gout. This thesis presents the results of a genome-wide study of selection in Polynesian (and other) populations, focusing on testing the hypothesis that genomic loci containing genes involved in urate processing have undergone selection. There was no evidence of wide-spread selection at genes associated with urate and gout, or related metabolic disorders, but there was evidence at some individual loci. Pathway analysis showed that of the significant pathways, there was a dominance of metabolic pathways that were enriched for genes with signatures of selection. Calcium related transport and signalling was a theme amongst loci that displayed signs of possible selection. Regions of the genome that were possibly selected in modern-day Polynesian populations also had similarities to those of modern-day East Asian populations. 
    This thesis has identified and characterised regions of the genome with possible evidence of genetic selection in Polynesian populations, which was not previously available. It has also provided insight into the role of genetic selection with respect to urate and metabolic disease.