26 research outputs found
Expanding the stdpopsim species catalog, and lessons learned for realistic genome simulations
Simulation is a key tool in population genetics for both methods development and empirical research, but producing simulations that recapitulate the main features of genomic datasets remains a major obstacle. Today, more realistic simulations are possible thanks to large increases in the quantity and quality of available genetic data, and the sophistication of inference and simulation software. However, implementing these simulations still requires substantial time and specialized knowledge. These challenges are especially pronounced for simulating genomes for species that are not well-studied, since it is not always clear what information is required to produce simulations with a level of realism sufficient to confidently answer a given question. The community-developed framework stdpopsim seeks to lower this barrier by facilitating the simulation of complex population genetic models using up-to-date information. The initial version of stdpopsim focused on establishing this framework using six well-characterized model species (Adrion et al., 2020). Here, we report on major improvements made in the new release of stdpopsim (version 0.2), which includes a significant expansion of the species catalog and substantial additions to simulation capabilities. Features added to improve the realism of the simulated genomes include non-crossover recombination and provision of species-specific genomic annotations. Through community-driven efforts, we expanded the number of species in the catalog more than threefold and broadened coverage across the tree of life. During the process of expanding the catalog, we have identified common sticking points and developed the best practices for setting up genome-scale simulations. We describe the input data required for generating a realistic simulation, suggest good practices for obtaining the relevant information from the literature, and discuss common pitfalls and major considerations. 
These improvements to stdpopsim aim to further promote the use of realistic whole-genome population genetic simulations, especially in non-model organisms, making them available, transparent, and accessible to everyone.
Biomolecular Dynamics and Function: A Study on Amino Acids and Enzymes
Proteins are biomolecules involved in cellular structure as well as function. These molecules are long-chain polymers consisting of amino acids, which are organic compounds containing many different functional groups such as amine (−NH2) and carboxylic acid (−COOH). Actin proteins form part of the cellular structure, membrane proteins act as channels for transfer of ions, and enzymes catalyze critical cellular reactions. While structure is a well-appreciated determinant of function, the role of the dynamics of proteins and solvent is less well studied. In my thesis work, I have studied the statistical fluctuations and dynamics of systems ranging from the basic amino acids in water up to the full complexity of enzymes. Combining experimental and computational techniques is a powerful approach for obtaining insight into dynamical events. I have used force-field based classical molecular dynamics simulations with an advanced polarizable force field to study the behaviour of these biomolecules in solution and have simulated experimental observables to understand conformational motions. In the first part of my thesis, I have characterized the dynamical modes of a basic protein unit, a single zwitterionic amino acid in solution, to make quantitative comparisons to the low-frequency terahertz (THz) absorption spectra. An analysis protocol for decomposing the THz absorption spectrum has been previously developed for analyzing zwitterion simulations performed using ab initio molecular dynamics (AIMD). In this work we extend the analysis method to simulations performed by force field molecular dynamics, which are computationally far less intensive, setting the stage for decomposing the THz spectra of larger proteins that are not affordable by AIMD. We also show that the main impact of solvation on the dynamical modes of zwitterions comes from the first solvation shell around the zwitterion only, and that the presence of waters further out does not affect the dynamics of these molecules significantly. In the second part of my thesis work, I have explored the role of statistical fluctuations of solvation for artificial enzymes, which have poor activity, and have evaluated how the entropic features change upon mutation through laboratory directed evolution, after which the enzymes show much greater activity. I have used two Kemp Eliminases (KE07 and KE70) and show that the active sites of these two enzymes have starkly contrasting interactions with solvent. KE07 incorporates water into the active site to enhance the catalysis of the Kemp Elimination reaction, while KE70 creates a strong hydrophobic pocket, leading to the catalysis being driven entirely by the protein residues at the active site. Different entropic species of waters are identified based on their vibrational dynamics, and we observe varying behaviour of waters between mutants, as well as in the presence of the ligand. In the final part of my thesis, I have looked at the dynamical correlations between residues in KE07 and have evaluated how these correlations change upon mutation through laboratory directed evolution. In particular, I have characterized the residue-residue interactions and find that there is correlated motion between surface loops in KE07, which could potentially modulate access of the ligand to the active site of the enzyme; we observe that the binding of the ligand increases the correlation between the residues of the protein in the higher performing variants of the enzyme.
Mode specific THz spectra of solvated amino acids using the AMOEBA polarizable force field.
We have used the AMOEBA model to simulate the THz spectra of two zwitterionic amino acids in aqueous solution and compared them with the results for these same systems from ab initio molecular dynamics (AIMD) simulations. Overall we find that the polarizable force field shows promising agreement with AIMD data for both glycine and valine in water. This includes the THz spectral assignments and the mode-specific spectral decomposition into intramolecular solute motions as well as distinct solute-water cross-correlation modes, some of which cannot be captured by non-polarizable force fields that rely on fixed partial charges. This bodes well for future studies simulating and decomposing the THz spectra of larger solutes such as proteins or polymers, for which AIMD studies are presently intractable. Furthermore, we believe that the current study on rather simple aqueous solutions offers a way to systematically investigate the importance of charge transfer, nuclear quantum effects, and the validity of computationally practical density functionals, all of which are needed to fully quantitatively capture complex dynamical motions in the condensed phase.
Solvent Entropy Contributions to Catalytic Activity in Designed and Optimized Kemp Eliminases.
We analyze the role of solvation for enzymatic catalysis in two distinct, artificially designed Kemp Eliminases, KE07 and KE70, and mutated variants that were optimized by laboratory directed evolution. Using a spatially resolved analysis of hydration patterns, intermolecular vibrations, and local solvent entropies, we identify distinct classes of hydration water and follow their changes upon substrate binding and transition state formation for the designed KE07 and KE70 enzymes and their evolved variants. We observe that differences in hydration of the enzymatic systems are concentrated in the active site and undergo significant changes during substrate recruitment. For KE07, directed evolution reduces variations in the hydration of the polar catalytic center upon substrate binding, preserving strong protein-water interactions, while the evolved enzyme variant of KE70 features a more hydrophobic reaction center for which the expulsion of low-entropy water molecules upon substrate binding is substantially enhanced. While our analysis indicates a system-dependent role of solvation for the substrate binding process, we identify more subtle changes in solvation for the transition state formation, which are less affected by directed evolution.
Analysis of 100 high-coverage genomes from a pedigreed captive baboon colony.
Baboons (genus Papio) are broadly studied in the wild and in captivity. They are widely used as a nonhuman primate model for biomedical studies, and the Southwest National Primate Research Center (SNPRC) at Texas Biomedical Research Institute has maintained a large captive baboon colony for more than 50 yr. Unlike other model organisms, however, the genomic resources for baboons are severely lacking. This has hindered the progress of studies using baboons as a model for basic biology or human disease. Here, we describe a data set of 100 high-coverage whole-genome sequences obtained from the mixed colony of olive (P. anubis) and yellow (P. cynocephalus) baboons housed at the SNPRC. These data provide a comprehensive catalog of common genetic variation in baboons, as well as a fine-scale genetic map. We show how the data can be used to learn about ancestry and admixture and to correct errors in the colony records. Finally, we investigated the consequences of inbreeding within the SNPRC colony and found clear evidence for increased rates of infant mortality and increased homozygosity of putatively deleterious alleles in inbred individuals.