
    The sequencing and interpretation of the genome obtained from a Serbian individual

    Recent genetic studies and whole-genome sequencing projects have greatly improved our understanding of human variation and clinically actionable genetic information. Smaller ethnic populations, however, remain underrepresented in both individual and large-scale sequencing efforts and hence present an opportunity to discover new variants of biomedical and demographic significance. This report describes the sequencing and analysis of a genome obtained from an individual of Serbian origin, introducing tens of thousands of previously unknown variants to the currently available pool. Ancestry analysis places this individual in close proximity to Central and Eastern European populations; i.e., closest to Croatian, Bulgarian and Hungarian individuals and, in terms of other Europeans, furthest from Ashkenazi Jewish, Spanish, Sicilian, and Baltic individuals. Our analysis confirmed gene flow between Neanderthal and ancestral pan-European populations, with similar contributions to the Serbian genome as those observed in other European groups. Finally, to assess the burden of potentially disease-causing/clinically relevant variation in the sequenced genome, we utilized manually curated genotype-phenotype association databases and variant-effect predictors. We identified several variants that have previously been associated with severe early-onset disease that is not evident in the proband, as well as variants that could yet prove to be clinically relevant to the proband over the next decades. The presence of numerous private and low-frequency variants, along with the observed and predicted disease-causing mutations in this genome, exemplifies some of the global challenges of genome interpretation, especially in the context of understudied ethnic groups. Comment: 18 pages, 2 figures

    The sequencing and interpretation of the genome obtained from a Serbian individual

    Recent genetic studies and whole-genome sequencing projects have greatly improved our understanding of human variation and clinically actionable genetic information. Smaller ethnic populations, however, remain underrepresented in both individual and large-scale sequencing efforts and hence present an opportunity to discover new variants of biomedical and demographic significance. This report describes the sequencing and analysis of a genome obtained from an individual of Serbian origin, introducing tens of thousands of previously unknown variants to the currently available pool. Ancestry analysis places this individual in close proximity to Central and Eastern European populations; i.e., closest to Croatian, Bulgarian and Hungarian individuals and, in terms of other Europeans, furthest from Ashkenazi Jewish, Spanish, Sicilian and Baltic individuals. Our analysis confirmed gene flow between Neanderthal and ancestral pan-European populations, with similar contributions to the Serbian genome as those observed in other European groups. Finally, to assess the burden of potentially disease-causing/clinically relevant variation in the sequenced genome, we utilized manually curated genotype-phenotype association databases and variant-effect predictors. We identified several variants that have previously been associated with severe early-onset disease that is not evident in the proband, as well as putatively impactful variants that could yet prove to be clinically relevant to the proband over the next decades. The presence of numerous private and low-frequency variants, along with the observed and predicted disease-causing mutations in this genome, exemplifies some of the global challenges of genome interpretation, especially in the context of under-studied ethnic groups.

    A Maximum-Likelihood Approach to Estimating the Insertion Frequencies of Transposable Elements from Population Sequencing Data

    Transposable elements (TEs) contribute to a large fraction of the expansion of many eukaryotic genomes because TEs can duplicate themselves through transposition. A first step to understanding the roles of TEs in a eukaryotic genome is to characterize the population-wide variation of TE insertions in the species. Here, we present a maximum-likelihood (ML) method for estimating allele frequencies and detecting selection on TE insertions in a diploid population, based on the genotypes at TE insertion sites detected in multiple individuals sampled from the population using paired-end (PE) sequencing reads. Tests of the method on simulated data show that it can accurately estimate the allele frequencies of TE insertions even when the PE sequencing is conducted at a relatively low coverage (5X). The method can also detect TE insertions under strong selection, and the detection ability increases with sample size in a population, although a substantial fraction of actual TE insertions under selection may be undetected. Application of the ML method to genomic sequencing data collected from a natural Daphnia pulex population shows that, on the one hand, most (>90%) TE insertions present in the reference D. pulex genome are either fixed or nearly fixed (with allele frequencies >0.95); on the other hand, among the nonreference TE insertions (i.e., those detected in some individuals in the population but absent from the reference genome), the majority (>70%) are still at low frequencies (<0.1). Finally, we detected a substantial fraction (∼9%) of nonreference TE insertions under selection.

    Adaptation of Escherichia coli to Long-Term Serial Passage in Complex Medium: Evidence of Parallel Evolution.

    Experimental evolution of bacterial populations in the laboratory has led to identification of several themes, including parallel evolution of populations adapting to carbon starvation, heat stress, and pH stress. However, most of these experiments study growth in defined and/or constant environments. We hypothesized that while there would likely continue to be parallelism in more complex and changing environments, there would also be more variation in what types of mutations would benefit the cells. In order to test our hypothesis, we serially passaged Escherichia coli in a complex medium (Luria-Bertani broth) throughout the five phases of bacterial growth. This passaging scheme allowed cells to experience a wide variety of stresses, including nutrient limitation, oxidative stress, and pH variation, and therefore allowed them to adapt to several conditions. After every ~30 generations of growth, for a total of ~300 generations, we compared both the growth phenotypes and genotypes of aged populations to the parent population. After as few as 30 generations, populations exhibited changes in growth phenotype and accumulated potentially adaptive mutations. There were many genes with mutant alleles in different populations, indicating potential parallel evolution. We examined 8 of these alleles by constructing the point mutations in the parental genetic background and competing those cells against the parent population; five of these alleles were found to be adaptive. The variety and swiftness of adaptive mutations arising in the populations indicate that the cells are adapting to a complex set of stresses, while the parallel nature of several of the mutations indicates that this behavior may be generalized to bacterial evolution.

    Characterization and Optimization of Multiomic Single-Cell Epigenomic Profiling

    The snATAC + snRNA platform allows epigenomic profiling of open chromatin and gene expression with single-cell resolution. The most critical assay step is to isolate high-quality nuclei to proceed with droplet-based single-nucleus isolation and barcoding. With the increasing popularity of multiomic profiling in various fields, there is a need for optimized and reliable nuclei isolation methods, mainly for human tissue samples. Herein we compared different nuclei isolation methods for cell suspensions, such as peripheral blood mononuclear cells (PBMC, n = 18) and a solid tumor type, ovarian cancer (OC, n = 18), derived from debulking surgery. Nuclei morphology and sequencing output parameters were used to evaluate the quality of preparation. Our results show that NP-40 detergent-based nuclei isolation yields better sequencing results than collagenase tissue dissociation for OC, significantly impacting cell type identification and analysis. Given the utility of applying such techniques to frozen samples, we also tested frozen preparation and digestion (n = 6). A paired comparison between frozen and fresh samples validated the quality of both specimens. Finally, we demonstrate the reproducibility of the scRNA and snATAC + snRNA platforms by comparing the gene expression profiling of PBMC. Our results highlight how the choice of nuclei isolation methods is critical for obtaining quality data in multiomic assays. They also show that the measurement of expression between scRNA and snRNA is comparable and effective for cell type identification.