478 research outputs found

    Population genomics reveals that within-fungus polymorphism is common and maintained in populations of the mycorrhizal fungus Rhizophagus irregularis.

    Arbuscular mycorrhizal (AM) fungi are symbionts of most plants, increasing plant growth and diversity. The model AM fungus Rhizophagus irregularis (isolate DAOM 197198) exhibits low within-fungus polymorphism. In contrast, another study reported high within-fungus variability. Experiments with other R. irregularis isolates suggest that within-fungus genetic variation can affect the fungal phenotype and plant growth, highlighting the biological importance of such variation. We investigated whether there is evidence of differing levels of within-fungus polymorphism in an R. irregularis population. We genotyped 20 isolates using restriction site-associated DNA sequencing and developed novel approaches for characterizing polymorphism among haploid nuclei. All isolates exhibited higher within-isolate poly-allelic single-nucleotide polymorphism (SNP) densities than DAOM 197198 in repeated and non-repeated sites mapped to the reference genome. Poly-allelic SNPs were independently confirmed. Allele frequencies within isolates deviated from those expected for diploids or tetraploids, or for a strict dikaryote. Phylogeny based on poly-allelic sites was robust and mirrored the standard phylogeny. This indicates that within-fungus genetic variation is maintained in AM fungal populations. Our results predict a heterokaryotic state in the population, considerable differences in copy number variation among isolates and divergence among the copies, or aneuploidy in some isolates. The variation may reflect a combination of all of these hypotheses. Within-isolate genetic variation in R. irregularis leads to large differences in plant growth. Therefore, characterizing genomic variation within AM fungal populations is of major ecological importance.

    Parsimony-based genetic algorithm for haplotype resolution and block partitioning

    This dissertation proposes a new algorithm for performing simultaneous haplotype resolution and block partitioning. The algorithm is based on a genetic algorithm approach and the parsimony principle. A multilocus LD measure (Normalized Entropy Difference) is used as the block identification criterion. The proposed algorithm incorporates missing data as a part of the model and allows blocks of arbitrary length. In addition, the algorithm provides scores for the block boundaries, which measure the strength of the boundaries at specific positions. The performance of the proposed algorithm was validated by running it on several publicly available data sets, including the HapMap data, and comparing the results to those of existing state-of-the-art algorithms. The results show that the accuracy of haplotype decomposition achieved by the proposed genetic algorithm falls within the range achieved by the other algorithms. The block structure output by our algorithm in general agrees with the block structure for the same data provided by the other algorithms. Thus, the proposed algorithm can be successfully used for block partitioning and haplotype phasing while providing some valuable new features, such as scores for block boundaries and fully incorporated treatment of missing data. In addition, the proposed algorithm for haplotyping and block partitioning is used in the development of a new clustering algorithm for two-population mixed genotype samples. The proposed clustering algorithm extracts from the given genotype sample two clusters with substantially different block structures and finds the haplotype resolution and block partitioning for each cluster.

    The roles of history, geography, and environment in shaping landscape genetic variation and its applied significance

    The decline and loss of species and genetic diversity as a result of anthropogenic change is occurring at an unprecedented rate, reshaping biodiversity and restructuring ecosystems. Population genetic variation is shaped by evolutionary processes and in turn determines the evolutionary potential of natural populations. Facilitated by recent improvements in DNA sequencing technologies, population genomic analyses can resolve patterns of genetic differentiation and evolutionary history, characterize the effects of evolutionary processes on genome variation, and facilitate an understanding of how response to environmental variation may underlie local adaptation. Such analyses can inform conservation and restoration by establishing baseline patterns of genetic variation across the landscape, recognizing evolutionarily significant units, sourcing propagules for restoration, and predicting species response to changing environmental conditions. Here, I applied high-throughput DNA sequencing approaches to characterize the historical, spatial, and environmental factors shaping genetic variation in several systems of conservation and restoration significance. First, I investigated hierarchical genetic structure and evolutionary history of Hucho taimen (taimen, the world’s largest salmonid), listed as vulnerable by the International Union for Conservation of Nature (IUCN), across multiple river basins in Russia and Mongolia. Second, I characterized patterns of emergent population genetic structure of nonnative Oncorhynchus mykiss (rainbow trout) in the Lake Tahoe basin to inform reintroduction of the native Oncorhynchus clarkii henshawi (Lahontan cutthroat trout), which is listed under the U.S. Endangered Species Act. Rainbow trout have been widely introduced across the globe and stocked for >50 years into Lake Tahoe, and an understanding of their population genetic structure may help inform strategies for successful native species reintroduction. 
Finally, I quantified spatial genetic structure, identified environmental variables potentially involved in local adaptation, and predicted variation in maladaptation under projected climate change across the range of Pinus muricata, a closed-cone pine occurring in a small number of isolated and disjunct stands along the coast of California, and also listed as vulnerable by the IUCN. Collectively, my research highlights the wide utility of population genomic analyses for taxa of conservation and restoration significance

    Migration without interbreeding: Evolutionary history of a highly selfing Mediterranean grass inferred from whole genomes

    Wild plant populations show extensive genetic subdivision and are far from the ideal of panmixia which permeates population genetic theory. Understanding the spatial and temporal scale of population structure is therefore fundamental for empirical population genetics, and of interest in itself, as it yields insights into the history and biology of a species. In this study we extend the genomic resources for the wild Mediterranean grass Brachypodium distachyon to investigate the scale of population structure and its underlying history at whole-genome resolution. A total of 86 accessions were sampled at local and regional scales in Italy and France, which closes a conspicuous gap in the collection for this model organism. The analysis of 196 accessions, spanning the Mediterranean from Spain to Iraq, suggests that the interplay of high selfing and seed dispersal rates has shaped genetic structure in B. distachyon. At the continental scale, the evolution of B. distachyon is characterized by the independent expansion of three lineages during the Upper Pleistocene. Today, these lineages may occur in the same meadow yet do not interbreed. At the regional scale, dispersal and selfing interact and maintain high genotypic diversity, thus challenging the textbook notion that selfing in finite populations implies reduced diversity. Our study extends the population genomic resources for B. distachyon and suggests that an important use of this wild plant model is to investigate how selfing and dispersal, two processes typically studied separately, interact in colonizing plant species.

    Visualization of Pairwise and Multilocus Linkage Disequilibrium Structure Using Latent Forests

    The study of linkage disequilibrium is a major topic in statistical genetics, as it plays a fundamental role in gene mapping and helps us to learn more about human history. The complex structure of linkage disequilibrium makes exploratory data analysis essential yet challenging. Visualization methods, such as the triangular heat map implemented in Haploview, provide simple and useful tools to help understand complex genetic patterns, but remain insufficient to fully describe them. Probabilistic graphical models have been widely recognized as a powerful formalism allowing concise and accurate modeling of dependences between variables. In this paper, we propose a method for short-range, long-range and chromosome-wide linkage disequilibrium visualization using forests of hierarchical latent class models. Thanks to its hierarchical nature, our method is shown to provide the geneticist with a compact view of both pairwise and multilocus linkage disequilibrium spatial structures. In addition, a multilocus linkage disequilibrium measure has been designed to evaluate linkage disequilibrium in hierarchy clusters. To learn the proposed model, a new scalable algorithm is presented. It constrains the dependence scope, relying on physical positions, and is able to deal with more than one hundred thousand single nucleotide polymorphisms. The proposed algorithm is fast and does not require phased genotypic data.

    Biogeography in the deep : hierarchical population genomic structure of two beaked whale species

    Funding for this research was provided by the Office of Naval Research, Award numbers N000141613017 and N000142112712. ABO was supported by a partial studentship from the University of St Andrews, School of Biology; OEG by the Marine Alliance for Science and Technology for Scotland (Scottish Funding Council grant HR09011); ELC by a Rutherford Discovery Fellowship from the Royal Society of New Zealand Te Aparangi; NAS by a Ramon y Cajal Fellowship from the Spanish Ministry of Innovation; MLM by the European Union’s Horizon 2020 Research and Innovation Programme (Marie Skłodowska-Curie grant 801199); CR by the Marine Institute (Cetaceans on the Frontier) and the Irish Research Council; and MTO by the Hartmann Foundation. The deep sea is the largest ecosystem on Earth, yet little is known about the processes driving patterns of genetic diversity in its inhabitants. Here, we investigated the macro- and microevolutionary processes shaping genomic population structure and diversity in two poorly understood, globally distributed, deep-sea predators: Cuvier’s beaked whale (Ziphius cavirostris) and Blainville’s beaked whale (Mesoplodon densirostris). We used double-digest restriction associated DNA (ddRAD) and whole mitochondrial genome (mitogenome) sequencing to characterise genetic patterns using phylogenetic trees, cluster analysis, isolation-by-distance, genetic diversity and differentiation statistics. Single nucleotide polymorphisms (SNPs; Blainville’s n = 43 samples, SNPs = 13988; Cuvier’s n = 123, SNPs = 30479) and mitogenomes (Blainville’s n = 27; Cuvier’s n = 35) revealed substantial hierarchical structure at a global scale. Both species display significant genetic structure between the Atlantic, Indo-Pacific and, in Cuvier’s, the Mediterranean Sea. Within major ocean basins, clear differentiation is found between genetic clusters on the east and west sides of the North Atlantic, and some distinct patterns of structure in the Indo-Pacific and Southern Hemisphere. 
We infer that macroevolutionary processes shaping patterns of genetic diversity include biogeographical barriers, highlighting the importance of such barriers even to highly mobile, deep-diving taxa. The barriers likely differ between the species due to their thermal tolerances and evolutionary histories. On a microevolutionary scale, it seems likely that the balance between resident populations displaying site fidelity, and transient individuals facilitating gene flow, shapes patterns of connectivity and genetic drift in beaked whales. Based on these results, we propose management units to facilitate improved conservation measures for these elusive species.

    The influence of history, geography, and environment on patterns of diversification in garter snakes (Thamnophis)

    A major goal of biology is to determine how and why diversity is generated and maintained–from subtle genetic variation between populations of the same species, to ecological differences between closely related species, to phenotypic divergence across deep phylogenetic lineages. Two key aspects in the evolution of biological diversity are space and time. When populations become physically isolated for long periods of time, evolutionary forces uniquely alter those populations and set them on distinct evolutionary trajectories. The spatial structure of genetic differentiation is greatly influenced by variation in evolutionary forces stemming from heterogeneous landscapes. Isolation among populations is largely mediated by physical features or ecological variation, which can fragment populations and allow for local adaptation to divergent ecologies. Because landscapes are dynamic and ever shifting, evolutionary forces such as gene flow, selection, and genetic drift continuously shape and reshape the spatial patterning of genetic variation and adaptive traits across a continuum of spatial and temporal scales. My dissertation research investigates these mechanisms by looking at patterns of phenotypic and genetic diversification at differing spatial and temporal scales in garter snakes–spanning entire clades of animals that have diversified across broad areas over long time periods, to fine-scale patterns of differentiation among populations of the same species. This largely focused on contemporary and historic biogeographic features and ecological influences on shaping genetic and phenotypic variation in garter snakes (Thamnophis). My first project investigates how biogeography and feeding ecology have shaped lineage diversification and morphological evolution across all of Thamnophis by reassessing phylogenetic relationships. 
My second project investigates how historical biogeography and environmental variation influence patterns of genetic diversity among and within three subspecies of T. elegans. My final project investigates how historical divergence and spatial genetic structure of populations underlie geographic variation in an adaptive phenotype in T. atratus. For each of my projects, I used reduced representation double-digest restriction associated DNA sequencing (ddRADseq). This large-scale dataset was used to quantify spatial genetic variation, characterize population genetic structure, and estimate phylogenetic relationships of lineages of garter snakes.

    Efficient algorithms in analyzing genomic data

    With the development of high-throughput and low-cost genotyping technologies, immense amounts of data can be cheaply and efficiently produced for various genetic studies. A typical dataset may contain hundreds of samples with millions of genotypes/haplotypes. In order to prevent data analysis from becoming a bottleneck, there is an evident need for fast and efficient analysis methods. My thesis focuses on two interesting and important genetic analysis problems. Genome-wide association mapping: The goal of genome-wide association mapping is to identify genes or narrow regions in the genome which have significant statistical correlations with the given phenotypes. The discovery of these genes offers the potential for increased understanding of biological processes affecting phenotypes such as body weight and blood pressure. Sample selection for maximal genetic diversity: Given a large set of samples, it is usually more efficient to first conduct experiments on a small subset. Then the following question arises: What subset to use? There are many experimental scenarios where the ultimate objective is to maintain, or at least maximize, the genetic diversity within relatively small breeding populations. In my thesis, I developed the following efficient and effective algorithms to address these problems. Phylogeny-based genome-wide association mapping: TreeQA: The algorithm uses local perfect phylogeny trees in genome-wide analysis for genotype/phenotype association mapping. Samples are partitioned according to the sub-trees they belong to. The association between a tree and the phenotype is measured by statistical tests. TreeQA+: TreeQA+ inherits all the advantages of TreeQA. Moreover, it improves on TreeQA by incorporating sample correlations into the association study. Sample selection for maximal genetic diversity: Sample Selection in Biallelic SNP Data: Samples are selected based on their genetic diversity among a set of SNPs. 
Given a set of samples, the algorithms search for the minimum subset that retains all diversity (or a high percentage of diversity). Representative Sample Selection in Non-Biallelic Data: For more general data (non-biallelic), information-theoretic measurements such as entropy and mutual information are used to measure the diversity of a sample subset. Samples are selected to maximize the original information retained
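
The idea of searching for a small subset of samples that retains all (or a high percentage) of the diversity can be illustrated with a minimal sketch. This is not the thesis's algorithm: it assumes haploid samples encoded as strings of '0'/'1' alleles, it defines "diversity" as the set of (site, allele) pairs observed in the full sample set, and it uses a simple greedy heuristic, which does not guarantee a minimum subset. The names `allele_pairs` and `greedy_select` are hypothetical.

```python
from typing import List, Set, Tuple


def allele_pairs(genotype: str) -> Set[Tuple[int, str]]:
    """Return the (site index, allele) pairs carried by one sample."""
    return {(site, allele) for site, allele in enumerate(genotype)}


def greedy_select(samples: List[str], coverage: float = 1.0) -> List[int]:
    """Greedily pick sample indices until the requested fraction of all
    observed (site, allele) pairs is retained.

    Assumption: diversity is counted as distinct (site, allele) pairs.
    The greedy rule is a heuristic and may return more samples than the
    true minimum subset.
    """
    carried = [allele_pairs(g) for g in samples]
    universe: Set[Tuple[int, str]] = set().union(*carried)
    target = coverage * len(universe)

    chosen: List[int] = []
    covered: Set[Tuple[int, str]] = set()
    while len(covered) < target:
        # Pick the sample that adds the most not-yet-covered pairs
        # (ties resolve to the lowest index).
        best = max(range(len(samples)), key=lambda i: len(carried[i] - covered))
        if not carried[best] - covered:
            break  # nothing new can be added
        chosen.append(best)
        covered |= carried[best]
    return chosen


if __name__ == "__main__":
    demo = ["0000", "1111", "0011", "0101"]
    print(greedy_select(demo))       # indices retaining every observed allele
    print(greedy_select(demo, 0.5))  # indices retaining half of them
```

Lowering `coverage` below 1.0 corresponds to the "high percentage of diversity" variant mentioned in the abstract; an entropy-based score could replace the pair count for non-biallelic data, but that is not shown here.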

    Development of a targeted amplicon sequencing method for genotyping Cyclospora cayetanensis from fresh produce and clinical samples with enhanced genomic resolution and sensitivity

    Outbreaks of cyclosporiasis, an enteric illness caused by the parasite Cyclospora cayetanensis, have been associated with consumption of various types of fresh produce. Although a method is in use for genotyping C. cayetanensis from clinical specimens, the very low abundance of C. cayetanensis in food and environmental samples presents a greater challenge. To complement epidemiological investigations, a molecular surveillance tool is needed for use in genetic linkage of food vehicles to cyclosporiasis illnesses, estimation of the scope of outbreaks or clusters of illness, and determination of geographical areas involved. We developed a targeted amplicon sequencing (TAS) assay that incorporates a further enrichment step to gain the requisite sensitivity for genotyping C. cayetanensis contaminating fresh produce samples. The TAS assay targets 52 loci, 49 of which are located in the nuclear genome, and encompasses 396 currently known SNP sites. The performance of the TAS assay was evaluated using lettuce, basil, cilantro, salad mix, and blackberries inoculated with C. cayetanensis oocysts. A minimum of 24 markers were haplotyped even at low contamination levels of 10 oocysts in 25 g leafy greens. The artificially contaminated fresh produce samples were included in a genetic distance analysis based on haplotype presence/absence with publicly available C. cayetanensis whole genome sequence assemblies. Oocysts from two different sources were used for inoculation, and samples receiving the same oocyst preparation clustered together, but separately from the other group, demonstrating the utility of the assay for genetically linking samples. Clinical fecal samples with low parasite loads were also successfully genotyped. This work represents a significant advance in the ability to genotype C. cayetanensis contaminating fresh produce along with greatly expanding the genomic diversity included for genetic clustering of clinical specimens