Expanding the stdpopsim species catalog, and lessons learned for realistic genome simulations
Simulation is a key tool in population genetics for both methods development and empirical research, but producing simulations that recapitulate the main features of genomic datasets remains a major obstacle. Today, more realistic simulations are possible thanks to large increases in the quantity and quality of available genetic data, and the sophistication of inference and simulation software. However, implementing these simulations still requires substantial time and specialized knowledge. These challenges are especially pronounced for simulating genomes for species that are not well-studied, since it is not always clear what information is required to produce simulations with a level of realism sufficient to confidently answer a given question. The community-developed framework stdpopsim seeks to lower this barrier by facilitating the simulation of complex population genetic models using up-to-date information. The initial version of stdpopsim focused on establishing this framework using six well-characterized model species (Adrion et al., 2020). Here, we report on major improvements made in the new release of stdpopsim (version 0.2), which includes a significant expansion of the species catalog and substantial additions to simulation capabilities. Features added to improve the realism of the simulated genomes include non-crossover recombination and provision of species-specific genomic annotations. Through community-driven efforts, we expanded the number of species in the catalog more than threefold and broadened coverage across the tree of life. During the process of expanding the catalog, we have identified common sticking points and developed the best practices for setting up genome-scale simulations. We describe the input data required for generating a realistic simulation, suggest good practices for obtaining the relevant information from the literature, and discuss common pitfalls and major considerations. 
These improvements to stdpopsim aim to further promote the use of realistic whole-genome population genetic simulations, especially in non-model organisms, making them available, transparent, and accessible to everyone.
Effects of intraspecific hybridization on the fitness of the egg parasitoid Trichogramma galloi
Successive rearing in laboratory conditions can result in the loss of genetic diversity, inbreeding depression and adaptation to the captive environment, affecting the quality of the insects reared and compromising their field performance. Introduction of genetic variation by admixing different populations may increase the fitness of populations, minimizing the negative effects of rearing many generations in artificial conditions. We experimentally investigated the role of intraspecific hybridization in enhancing the fitness of the egg parasitoid Trichogramma galloi Zucchi, 1988 (Hymenoptera: Trichogrammatidae), by reciprocally crossing three populations. Our results showed that the mating type did not affect the number of crosses that produced viable daughters. Homotypic crosses produced 94% viable daughters, while heterotypic crosses produced 92%. There were neither mating incompatibilities nor reproductive barriers between these populations. However, we observed a low fitness value for females from one of the populations studied. The fitness of hybrids was either unchanged or improved (in one case) when compared to the parental populations. We discuss the implications of our results and suggest future research directions.
Joint inference of demography and selection from temporal population genomic data via approximate Bayesian computation
Contemporary genetic data has been extensively used to infer demographic and adaptive history of populations. However, those inferences integrate effects over large periods of time and are often uninformative about the recent events that will be more relevant for the management of populations. In contrast, temporal genetic data, such as those obtained from ancient and modern samples or from monitoring surveys, can provide information on the recent evolutionary history of the target population. At present, there are some statistical methods available to make inferences from temporal population genetic data, but most of them suffer from two limitations: (1) demographic inference ignores the effects of linked selection which, in some cases, can produce biases in the inference; and (2) inference of selection focuses on loci with large effects that produce outlier patterns but that provide little information about the adaptive potential and viability of the population. In this work we propose a simulation-based approach (approximate Bayesian computation, ABC) to address those limitations. Using individual-based forward-in-time simulations we are able to model multi-locus selection processes and their effect on whole-genome diversity. In each simulation, latent variables (effective population size and genetic load) are calculated that integrate demographic and selective information in a way that is not captured by the original parameters of the model (e.g. census population size). Simulations are used to generate a training data set and Random Forest ABC is used to learn about the demographic and selective parameters and variables from the genetic diversity patterns. The performance of the inference is evaluated via out-of-bag estimates. The results show this is a promising approach for the joint inference of demography and selection. 
Inference of effective population size is accurate even in scenarios with pervasive selection where naive estimators show significant bias. As an example, the method is applied to a data set of feral populations of European bee in North America (modern samples and museum specimens), with results congruent with the known biology of the species.
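The simulate-then-compare logic behind ABC on temporal data can be illustrated with a minimal sketch. This is not the method of the abstract above: it swaps Random Forest ABC for plain rejection ABC, models neutral drift only (no selection, no genetic load), and treats allele frequencies as known without sampling error. The function names, the log-uniform prior bounds (10 to 200), the summary statistic and all numeric settings are assumptions chosen for illustration.

```python
import math
import random
import statistics


def drift(p0, n_copies, generations, rng):
    """Wright-Fisher drift: resample n_copies gene copies each generation."""
    p = p0
    for _ in range(generations):
        # Binomial draw written as a sum of Bernoulli trials (stdlib only).
        p = sum(rng.random() < p for _ in range(n_copies)) / n_copies
    return p


def temporal_f(p_start, p_end):
    """Mean standardized squared change in allele frequency across loci."""
    terms = []
    for x, y in zip(p_start, p_end):
        mean = (x + y) / 2
        denom = mean * (1 - mean)
        if denom > 0:  # skip loci fixed at both time points
            terms.append((x - y) ** 2 / denom)
    return statistics.fmean(terms) if terms else 0.0


def simulate_summary(ne, p_start, generations, rng):
    """Simulate drift in a diploid population of size ne; return the summary."""
    p_end = [drift(p, 2 * ne, generations, rng) for p in p_start]
    return temporal_f(p_start, p_end)


def abc_rejection(observed, p_start, generations, n_sims, keep, rng):
    """Keep the prior draws whose simulated summary is closest to the observed one."""
    draws = []
    for _ in range(n_sims):
        # Log-uniform prior on Ne between 10 and 200 (an arbitrary choice).
        ne = round(math.exp(rng.uniform(math.log(10), math.log(200))))
        stat = simulate_summary(ne, p_start, generations, rng)
        draws.append((abs(stat - observed), ne))
    draws.sort()
    return [ne for _, ne in draws[:keep]]


rng = random.Random(1)
p_start = [0.5] * 20  # 20 loci, all starting at frequency 0.5
# Pseudo-observed data generated with a known Ne of 50, 5 generations apart.
observed = simulate_summary(50, p_start, 5, rng)
accepted = abc_rejection(observed, p_start, 5, n_sims=200, keep=20, rng=rng)
estimate = statistics.median(accepted)
print(f"approximate posterior median Ne: {estimate}")
```

The accepted values of Ne form a crude approximation to the posterior. A random-forest approach, as described in the abstract, would instead train a regression model on all simulated (summary, parameter) pairs and assess it with out-of-bag error rather than discarding most simulations.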
Expanding the stdpopsim species catalog, and lessons learned for realistic genome simulations.
Peer reviewed: True
Acknowledgements: We wish to thank the dozens of workshop attendees, and especially the two dozen or so hackathon participants, whose combined feedback motivated many of the updates made to stdpopsim in the past two years.
Funder: Robertson Foundation; FundRef: http://dx.doi.org/10.13039/100013961