119 research outputs found

    Aquaporins in the wild: natural genetic diversity and selective pressure in the PIP gene family in five Neotropical tree species

    Background: Tropical trees undergo severe stress through seasonal drought and flooding, and the ability of these species to respond may be a major factor in their survival in tropical ecosystems, particularly in relation to global climate change. Aquaporins are involved in the regulation of water flow and have been shown to be involved in drought response; they may therefore play a major adaptive role in these species. We describe genetic diversity in the PIP sub-family of the widespread gene family of Aquaporins in five Neotropical tree species covering four botanical families. Results: PIP Aquaporin subfamily genes were isolated, and their DNA sequence polymorphisms characterised in natural populations. Sequence data were analysed with statistical tests of standard neutral equilibrium and demographic scenarios simulated to compare with the observed results. Chloroplast SSRs were also used to test demographic transitions. Most gene fragments are highly polymorphic and display signatures of balancing selection or bottlenecks; chloroplast SSR markers have significant statistics that do not conform to expectations for population bottlenecks. Although not incompatible with a purely demographic scenario, the combination of all tests tends to favour a selective interpretation of extant gene diversity. Conclusions: Tropical tree PIP genes may generally undergo balancing selection, which may maintain high levels of genetic diversity at these loci. Genetic variation at PIP genes may represent a response to variable environmental conditions.
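The abstract above does not name the neutrality tests that were applied. Purely as an illustration, one widely used test of the standard neutral model is Tajima's D, which compares nucleotide diversity with Watterson's estimator; positive values indicate an excess of intermediate-frequency variants, as expected under balancing selection or after a bottleneck. The sketch below assumes haplotypes are supplied as equal-length strings of 0/1 alleles; the function name `tajimas_d` and the input format are assumptions made for this example, not the authors' pipeline.

```python
from math import sqrt


def tajimas_d(haplotypes):
    """Tajima's D for a list of equal-length '0'/'1' haplotype strings.

    Returns None when D is undefined (no segregating sites, or fewer than
    four sequences, where the variance term is zero).
    """
    n = len(haplotypes)
    length = len(haplotypes[0])
    # Count of the '1' allele at every site.
    counts = [sum(h[j] == "1" for h in haplotypes) for j in range(length)]
    segregating = [k for k in counts if 0 < k < n]
    s = len(segregating)
    if s == 0 or n < 4:
        return None

    # Mean number of pairwise differences (nucleotide diversity, pi).
    pi = sum(2.0 * k * (n - k) / (n * (n - 1)) for k in segregating)

    # Constants from Tajima (1989).
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    theta_w = s / a1  # Watterson's estimator
    return (pi - theta_w) / sqrt(e1 * s + e2 * s * (s - 1))
```

In this toy setting, a sample dominated by intermediate-frequency variants gives D > 0, while a sample made up only of singletons gives D < 0.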

    The effects of sample size on population genomic analyses – implications for the tests of neutrality

    Background: One of the fundamental measures of molecular genetic variation is Watterson's estimator (Θ), which is based on the number of segregating sites. The estimation of Θ is unbiased only under neutrality and constant population size. It is well known that the estimation of Θ is biased when these assumptions are violated. However, the effect of sample size in modulating this bias has not been well appreciated. Results: We examined this issue in detail based on large-scale exome data and robust simulations. Our investigation revealed that sample size appreciably influences Θ estimation and that this effect is much greater for constrained genomic regions than for neutral regions. For instance, Θ estimated for synonymous sites using 512 human exomes was 1.9 times higher than that obtained using 16 exomes. However, this difference was 2.5 times for the nonsynonymous sites of the same data. We observed a positive correlation between the rate of increase in Θ estimates (with respect to the sample size) and the magnitude of selection pressure. For example, Θ estimated for the nonsynonymous sites of highly constrained genes (dN/dS < 0.1) using 512 exomes was 3.6 times higher than that estimated using 16 exomes. In contrast, this difference was only 2 times for the less constrained genes (dN/dS > 0.9). Conclusions: The results of this study reveal the extent of underestimation owing to small sample sizes and thus emphasize the importance of sample size in estimating a number of population genomic parameters. Our results have serious implications for neutrality tests such as Tajima's D, Fu and Li's D and those based on the McDonald and Kreitman test: the Neutrality Index and the fraction of adaptive substitutions. For instance, use of 16 exomes produced a 2.4 times higher proportion of adaptive substitutions compared to that obtained using 512 exomes (24% vs 10%). © 2016 Subramanian
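For readers unfamiliar with the quantities named in this abstract, the standard textbook definitions can be sketched in a few lines of Python. Watterson's estimator divides the number of segregating sites S by the harmonic sum a_n = 1 + 1/2 + ... + 1/(n-1), where n is the number of sampled sequences. The McDonald and Kreitman statistics are computed from counts of nonsynonymous and synonymous divergence (Dn, Ds) and polymorphism (Pn, Ps). The function names are hypothetical and chosen for this example; nothing here reproduces the study's own code or data.

```python
def harmonic(n):
    """a_n = sum of 1/i for i = 1 .. n-1, with n the number of sequences."""
    return sum(1.0 / i for i in range(1, n))


def watterson_theta(segregating_sites, n):
    """Watterson's estimator: theta_W = S / a_n."""
    return segregating_sites / harmonic(n)


def neutrality_index(dn, ds, pn, ps):
    """McDonald-Kreitman Neutrality Index: NI = (Pn / Ps) / (Dn / Ds)."""
    return (pn / ps) / (dn / ds)


def mk_alpha(dn, ds, pn, ps):
    """Estimated fraction of adaptive substitutions: alpha = 1 - NI."""
    return 1.0 - (ds * pn) / (dn * ps)
```

Under neutrality and constant population size the expected value of S is Θ·a_n, so dividing by a_n removes the dependence on n. The abstract's point is that, in real exome data, S grows faster with n than a_n does, most strongly at constrained sites, so Pn and Ps (and therefore NI and alpha) also shift with sample size.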

    Genomic patterns in the widespread Eurasian lynx shaped by Late Quaternary climatic fluctuations and anthropogenic impacts.

    Disentangling the contribution of long-term evolutionary processes and recent anthropogenic impacts to current genetic patterns of wildlife species is key for assessing genetic risks and designing conservation strategies. Here, we used 80 whole nuclear genomes and 96 mitogenomes from populations of the Eurasian lynx covering a range of conservation statuses, climatic zones and subspecies across Eurasia to infer the demographic history, reconstruct genetic patterns and discuss the influence of long-term isolation and/or more recent human-driven changes. Our results show that Eurasian lynx populations shared a common history until 100 kya, when Asian and European populations started to diverge and both entered a period of continuous and widespread decline, with western populations, except Kirov, maintaining lower effective sizes than eastern populations. Population declines and increased isolation in more recent times likely drove the genetic differentiation between geographically and ecologically close westernmost European populations. By contrast, and despite the wide range of habitats covered, populations are quite homogeneous genetically across the Asian range, showing a pattern of isolation by distance and providing little genetic support for the several proposed subspecies. Mitogenomic and nuclear divergences and population declines starting during the Late Pleistocene can be mostly attributed to climatic fluctuations and early human influence, but the widespread and sustained decline since the Holocene is more probably the consequence of anthropogenic impacts which intensified during the last centuries, especially in western Europe. Genetic erosion in isolated European populations and lack of evidence for long-term isolation argue for the restoration of lost population connectivity

    Evolutionary and population genomic analyses of the Zymoseptoria species complex

    Plant pathogenic fungi in agricultural environments have to adapt to rapidly changing conditions. Comparative analyses of closely related pathogens adapted to natural or agricultural ecosystems provide deeper insight into the evolutionary responses that play a role in adaptation to these differing environments. The members of the Zymoseptoria species complex are an excellent model system for comparative analyses since they are either specialized on wheat (Z. tritici) or associated with wild plant species (Z. ardabiliae, Z. brevis, Z. pseudotritici). Using methods of comparative genomics and molecular genetics, this thesis shows that the Zymoseptoria species have unusual genome plasticity, which is characterized by the occurrence of accessory chromosomes and by structural changes associated with repetitive elements. Further, the results of population genomic analyses show that genetic diversity is much higher in the wheat pathogen Z. tritici than in the wild-grass-associated pathogens Z. ardabiliae and Z. brevis. The results of this thesis connect the observed differences in genetic diversity levels between the species with differences in demographic events during their evolution. Genome-wide analyses of signatures of selection reveal a much higher number of genes with signatures of positive selection in the wheat pathogen Z. tritici than in Z. ardabiliae and Z. brevis. Further, an ABC transporter candidate shows signatures of positive directional and positive diversifying selection in Z. tritici. Previous studies have already connected ABC transporters with virulence and drug resistance in this species, making this ABC transporter an interesting candidate for future analyses. In this thesis, the detection of effector candidates, a class of proteins that are known to be important for modulating the immune system of the host, indicates the importance of species-specific effectors for host adaptation.

    Distinguishing between recent balancing selection and incomplete sweep using deep neural networks

    Balancing selection is an important adaptive mechanism underpinning a wide range of phenotypes. Despite its relevance, the detection of recent balancing selection from genomic data is challenging as its signatures are qualitatively similar to those left by ongoing positive selection. In this study, we developed and implemented two deep neural networks and tested their performance to predict loci under recent selection, either due to balancing selection or incomplete sweep, from population genomic data. Specifically, we generated forward-in-time simulations to train and test an artificial neural network (ANN) and a convolutional neural network (CNN). ANN received as input multiple summary statistics calculated on the locus of interest, while CNN was applied directly on the matrix of haplotypes. We found that both architectures have high accuracy to identify loci under recent selection. CNN generally outperformed ANN to distinguish between signals of balancing selection and incomplete sweep and was less affected by incorrect training data. We deployed both trained networks on neutral genomic regions in European populations and demonstrated a lower false-positive rate for CNN than ANN. We finally deployed CNN within the MEFV gene region and identified several common variants predicted to be under incomplete sweep in a European population. Notably, two of these variants are functional changes and could modulate susceptibility to familial Mediterranean fever, possibly as a consequence of past adaptation to pathogens. In conclusion, deep neural networks were able to characterize signals of selection on intermediate frequency variants, an analysis currently inaccessible by commonly used strategies
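The difference between the two inputs described above (summary statistics for the ANN, the raw haplotype matrix for the CNN) can be made concrete with a small sketch. The abstract does not list which summary statistics were used or how the matrix was arranged, so everything below is an assumption for illustration: the statistics (segregating sites, nucleotide diversity, number of distinct haplotypes), the row ordering, and the function names are all hypothetical.

```python
def summary_vector(haplotypes):
    """Fixed-length feature vector of the kind an ANN could take as input.

    Returns [segregating sites, nucleotide diversity, distinct haplotypes].
    """
    n = len(haplotypes)
    length = len(haplotypes[0])
    counts = [sum(h[j] == "1" for h in haplotypes) for j in range(length)]
    segregating = [k for k in counts if 0 < k < n]
    pi = sum(2.0 * k * (n - k) / (n * (n - 1)) for k in segregating)
    return [len(segregating), pi, len(set(haplotypes))]


def haplotype_image(haplotypes):
    """Binary matrix (rows = haplotypes, columns = sites) for a CNN.

    Rows are sorted so that identical haplotypes sit next to each other;
    this ordering is an assumption of the sketch, not the paper's method.
    """
    ordered = sorted(haplotypes, reverse=True)
    return [[int(allele) for allele in h] for h in ordered]
```

The summary vector discards the arrangement of alleles along the locus, whereas the matrix keeps it, which is one plausible reason a network applied directly to haplotypes can separate signals that look alike in summary statistics.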

    Statistical inference of complex demographic models in Drosophila melanogaster and two wild tomato species

    The aim of this thesis was to use the genealogical information contained in genetic variation profiles of natural populations to describe the evolution of a particular species. In the first project we analysed the colonization process that brought Drosophila melanogaster from Africa to Asia. Southeast Asian populations of the fruit fly D. melanogaster differ from ancestral African and derived European populations by several morphological characteristics. It has been argued that this morphological differentiation could be the result of an early colonization of Southeast Asia that predated the migration of D. melanogaster to Europe after the last glacial period (around 10,000 years ago). To investigate the colonization process of Southeast Asia, we collected nucleotide polymorphism data for 200 X-linked and 50 autosomal loci from a Malaysian population. We analysed this new SNP dataset jointly with already existing data from an African and a European population by employing an Approximate Bayesian Computation (ABC) approach. By contrasting different demographic models of these three populations, we do not find any evidence for an early divergence between the African and the Asian populations. Rather, we show that Asian and European populations of D. melanogaster share a non-African most recent common ancestor (MRCA) that existed about 2500 years ago. The second project of my PhD thesis is an analysis of the importance of seed dormancy at the population level in two wild tomato species. Seed banks, that is, plant seeds remaining in soils for several generations before germination, are of practical importance in conservation biology because they diminish the immediate ecological impact of habitat fragmentation and prevent species extinction. From an evolutionary perspective, seed banks increase the genetic diversity of plant populations and buffer the effect of varying climatic conditions by magnifying the effects of good years and by dampening the effects of bad years. In this study we estimate the germination rates for two wild tomato species (Solanum chilense and Solanum peruvianum) found in western South America in a wide range of habitats by using DNA sequences coupled to a coalescent model in combination with ecological data. We develop an ABC framework to integrate ecological information on above-ground population census sizes, in order to estimate seed bank and metapopulation parameters for each species. We provide the first evidence that it is possible to disentangle the effect of the metapopulation structure from that of the seed bank on the effective population size and to obtain accurate estimates of germination rates based on a coalescent model. The third and last project of this thesis is related to the development of a computational tool that facilitates the analysis of nucleotide polymorphism datasets in an ABC framework. With the availability of whole-genome sequence data, biologists are able to test hypotheses regarding the demography of populations. Furthermore, the advancement of the ABC methodology allows the demographic inference to be performed in a simple framework using summary statistics. We present here msABC, a coalescent-based software that facilitates the simulation of multi-locus data, suitable for an ABC analysis. msABC is based on Hudson's ms algorithm, which is used extensively for simulating neutral demographic histories of populations. The flexibility of the original algorithm has been extended so that sample size may vary among loci, missing data can be incorporated in simulations and calculations, and a multitude of summary statistics for single or multiple populations is generated. The source code of msABC is available at http://bio.lmu.de/~pavlidis/msabc
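The ABC workflow summarised above (draw parameters from a prior, simulate multi-locus data under a coalescent model, keep the draws whose summary statistics fall close to the observed ones) can be sketched with a rejection sampler. The example below does not call msABC or ms; it uses a deliberately simple stand-in simulator for the number of segregating sites at a neutral, constant-size locus. The uniform prior, the single summary statistic (mean S across loci), the tolerance, and all function names are assumptions chosen for illustration.

```python
import random


def simulate_segregating_sites(theta, n, rng):
    """Segregating sites at one neutral locus for n sampled sequences."""
    total_length = 0.0
    for k in range(n, 1, -1):
        # Waiting time while k lineages remain (units of 2N generations),
        # multiplied by k to accumulate total branch length.
        total_length += k * rng.expovariate(k * (k - 1) / 2.0)
    # Mutations fall on the tree as a Poisson process with rate theta / 2.
    expected = theta / 2.0 * total_length
    count, t = 0, rng.expovariate(1.0)
    while t < expected:
        count += 1
        t += rng.expovariate(1.0)
    return count


def mean_segregating_sites(theta, n, loci, rng):
    """Summary statistic: mean number of segregating sites across loci."""
    total = sum(simulate_segregating_sites(theta, n, rng) for _ in range(loci))
    return total / loci


def abc_rejection(observed, n, loci, draws, tolerance, prior_max, rng):
    """Keep prior draws whose simulated summary is within the tolerance."""
    accepted = []
    for _ in range(draws):
        theta = rng.uniform(0.0, prior_max)  # assumed Uniform(0, prior_max) prior
        simulated = mean_segregating_sites(theta, n, loci, rng)
        if abs(simulated - observed) <= tolerance:
            accepted.append(theta)
    return accepted
```

A call such as `abc_rejection(17.7, 20, 20, 1000, 1.0, 20.0, random.Random(1))` returns a list of accepted values that approximates the posterior for Θ given a hypothetical observed mean of 17.7 segregating sites in 20 sequences. Real analyses typically combine many summary statistics and more elaborate demographic models, which is the use case the abstract describes for msABC.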