Rapid forward-in-time simulation at the chromosome and genome level
Background: In population genetics, simulation is a fundamental tool for analyzing how basic evolutionary forces such as natural selection, recombination, and mutation shape the genetic landscape of a population. Forward simulation represents the most powerful, but, at the same time, most compute-intensive approach for simulating the genetic material of a population.
Results: We introduce AnA-FiTS, a highly optimized forward simulation tool that is up to two orders of magnitude faster than current state-of-the-art software. In addition, we present a novel algorithm that improves runtimes by up to a further order of magnitude for simulations in which a fraction of the mutations is neutral (e.g., only 10% of mutations have an effect on fitness). Apart from simulated sequences, our tool also generates a graph structure that depicts the complete observable history of neutral mutations.
Conclusions: The substantial performance improvements allow for conducting forward simulations at the chromosome and genome level. The graph structure generated by our algorithm can give rise to novel approaches for visualizing and analyzing the output of forward simulations.
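To make the term "forward simulation" concrete, the following is a minimal sketch of a haploid Wright-Fisher model with selection and mutation at a single biallelic locus. It is not the AnA-FiTS algorithm and contains none of its optimizations; the function name `wright_fisher` and all of its parameters are hypothetical and chosen for illustration only.

```python
import random

def wright_fisher(pop_size, generations, mut_rate, sel_coeff, seed=0):
    """Forward-simulate one biallelic locus in a haploid Wright-Fisher population.

    Returns the derived-allele frequency after each generation.
    Assumes sel_coeff > -1 so that all fitness weights stay positive.
    """
    rng = random.Random(seed)
    pop = [0] * pop_size  # 0 = ancestral allele, 1 = derived allele
    freqs = []
    for _ in range(generations):
        # Selection: carriers of the derived allele have fitness 1 + sel_coeff.
        weights = [1.0 + sel_coeff if allele else 1.0 for allele in pop]
        # Reproduction: every offspring draws a parent in proportion to fitness.
        pop = rng.choices(pop, weights=weights, k=pop_size)
        # Mutation: ancestral alleles become derived with probability mut_rate.
        pop = [1 if (a == 0 and rng.random() < mut_rate) else a for a in pop]
        freqs.append(sum(pop) / pop_size)
    return freqs
```

Every individual is visited in every generation, which is why the cost of this style of simulation grows with population size, number of generations, and, in a multi-locus version, sequence length.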
XSim: Simulation of Descendants from Ancestors with Sequence Data.
Real or imputed high-density SNP genotypes are routinely used for genomic prediction and genome-wide association studies. Many researchers are moving toward the use of actual or imputed next-generation sequence data in whole-genome analyses. Simulation studies are useful to mimic complex scenarios and test different analytical methods. We have developed the software tool XSim to efficiently simulate sequence data in descendants in arbitrary pedigrees. Rather than dropping every allele state down the pedigree, XSim drops down the origins and positions of chromosomal segments, which makes it possible to simulate sequence data and to accommodate complicated pedigree structures across multiple generations. Both C++ and Julia versions of XSim have been developed.
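The segment-based strategy described above can be sketched as follows. This is an illustrative assumption about how such a representation might look, not XSim's actual data structures or API: a haplotype is stored as a sorted list of `(start_position, founder_id)` pairs, and a gamete is built by switching between the two parental haplotypes at each crossover position. The names `origin_at` and `make_gamete` are hypothetical.

```python
import bisect

# A haplotype is a sorted list of (start_position, founder_id) segments.
# Each segment extends to the start of the next one (or the chromosome end).

def origin_at(hap, pos):
    """Return the founder id that covers position pos on haplotype hap."""
    starts = [start for start, _ in hap]
    return hap[bisect.bisect_right(starts, pos) - 1][1]

def make_gamete(hap_a, hap_b, crossovers):
    """Recombine two parental haplotypes at sorted crossover positions."""
    parents = (hap_a, hap_b)
    bounds = [0] + list(crossovers) + [float("inf")]
    gamete = []
    for i in range(len(bounds) - 1):
        lo, hi = bounds[i], bounds[i + 1]
        src = parents[i % 2]  # switch source haplotype at every crossover
        # Segment active at lo, plus every segment starting inside (lo, hi).
        pieces = [(lo, origin_at(src, lo))]
        pieces += [(s, f) for s, f in src if lo < s < hi]
        for seg in pieces:
            # Merge adjacent pieces that trace back to the same founder.
            if not gamete or gamete[-1][1] != seg[1]:
                gamete.append(seg)
    return gamete

# Usage: two founders, then a second meiosis involving a third founder.
child = make_gamete([(0, "A")], [(0, "B")], [500])
grandchild = make_gamete(child, [(0, "C")], [250, 750])
```

Only segment boundaries are copied at each meiosis, so the work per gamete depends on the number of crossovers rather than on the number of sequence variants. Allele states can be resolved afterwards by looking up the founder sequence indicated by `origin_at`.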
Human-chimpanzee alignment: Ortholog Exponentials and Paralog Power Laws
Genomic subsequences conserved between closely related species such as human
and chimpanzee exhibit an exponential length distribution, in contrast to the
algebraic length distribution observed for sequences shared between distantly
related genomes. We find that the former exponential can be further decomposed
into an exponential component primarily composed of orthologous sequences, and
a truncated algebraic component primarily composed of paralogous sequences.
Comment: Main text: 31 pages, 13 figures, 1 table; Supplementary materials: 9
pages, 9 figures, 1 table
Natural selection reduced diversity on human Y chromosomes
The human Y chromosome exhibits surprisingly low levels of genetic diversity.
This could result from neutral processes if the effective population size of
males is reduced relative to females due to a higher variance in the number of
offspring from males than from females. Alternatively, selection acting on new
mutations, and affecting linked neutral sites, could reduce variability on the
Y chromosome. Here, using genome-wide analyses of X, Y, autosomal and
mitochondrial DNA, in combination with extensive population genetic
simulations, we show that low observed Y chromosome variability is not
consistent with a purely neutral model. Instead, we show that models of
purifying selection are consistent with observed Y diversity. Further, the
number of sites estimated to be under purifying selection greatly exceeds the
number of Y-linked coding sites, suggesting the importance of the highly
repetitive ampliconic regions. While we show that purifying selection removing
deleterious mutations can explain the low diversity on the Y chromosome, we
cannot exclude the possibility that positive selection acting on beneficial
mutations could have also reduced diversity in linked neutral regions, and may
have contributed to lowering human Y chromosome diversity. Because the
functional significance of the ampliconic regions is poorly understood, our
findings should motivate future research in this area.
Comment: 43 pages, 11 figures
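The neutral explanation considered in the abstract above rests on how effective population size differs between locus types. The sketch below uses the standard textbook expectations for autosomal, X-linked, Y-linked, and mitochondrial effective sizes given the numbers of breeding males and females. It is not the simulation framework used in the paper; the function names are hypothetical, and mutation-rate differences between chromosomes are ignored.

```python
def neutral_effective_sizes(n_males, n_females):
    """Textbook neutral effective population sizes by locus type."""
    nm, nf = float(n_males), float(n_females)
    return {
        "autosome": 4 * nm * nf / (nm + nf),
        "X": 9 * nm * nf / (4 * nm + 2 * nf),
        "Y": nm / 2,
        "mtDNA": nf / 2,
    }

def relative_to_autosomes(sizes):
    """Express each effective size as a fraction of the autosomal one."""
    return {locus: ne / sizes["autosome"] for locus, ne in sizes.items()}

# Equal numbers of breeding males and females: Y is expected at 1/4 of autosomes.
equal = relative_to_autosomes(neutral_effective_sizes(5000, 5000))
# Fewer breeding males (e.g. high variance in male offspring number):
# the expected Y/autosome ratio drops below 1/4.
skewed = relative_to_autosomes(neutral_effective_sizes(1000, 9000))
```

Under these expectations, a skew toward fewer breeding males lowers the Y-to-autosome ratio while raising the mitochondrial one. Comparing X, Y, autosomal, and mitochondrial data jointly can therefore help to tell a purely demographic explanation apart from selection.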
forqs: Forward-in-time Simulation of Recombination, Quantitative Traits, and Selection
forqs is a forward-in-time simulation of recombination, quantitative traits,
and selection. It was designed to investigate haplotype patterns resulting from
scenarios where substantial evolutionary change has taken place in a small
number of generations due to recombination and/or selection on polygenic
quantitative traits. forqs is implemented as a command-line C++ program.
Source code and binary executables for Linux, OSX, and Windows are freely
available under a permissive BSD license.
Comment: preprint includes Supplementary Information.
https://bitbucket.org/dkessner/forq
Coalescence, genetic diversity in sexual populations under selection
In sexual populations, selection operates neither on the whole genome, which
is repeatedly taken apart and reassembled by recombination, nor on individual
alleles that are tightly linked to the chromosomal neighborhood. The resulting
interference between linked alleles reduces the efficiency of selection and
distorts patterns of genetic diversity. Inference of evolutionary history from
diversity shaped by linked selection requires an understanding of these
patterns. Here, we present a simple but powerful scaling analysis identifying
the unit of selection as the genomic "linkage block" with a characteristic
length determined in a self-consistent manner by the condition that the rate of
recombination within the block is comparable to the fitness differences between
different alleles of the block. We find that an asexual model with the strength
of selection tuned to that of the linkage block provides an excellent
description of genetic diversity and the site frequency spectra when compared
to computer simulations. This linkage block approximation is accurate for the
entire spectrum of strength of selection and is particularly powerful in
scenarios with many weakly selected loci. The latter limit allows us to
characterize coalescence, genetic diversity, and the speed of adaptation in the
infinitesimal model of quantitative genetics.
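The self-consistency condition for the linkage block can be written out as a short scaling estimate. The symbols below are assumptions introduced for illustration and may differ from the paper's notation: $L$ is the genome map length, $\rho$ the crossover rate per unit length, $\sigma^2$ the total fitness variance (assumed here to be spread evenly along the genome), and $\xi_b$ the block length.

```latex
% Fitness variance carried by a block of length \xi_b (even-spread assumption):
\sigma_b^2 \sim \sigma^2 \, \frac{\xi_b}{L}
% Self-consistency: recombination rate within the block is comparable to the
% typical fitness difference between alleles of the block:
\rho \, \xi_b \sim \sigma_b = \sigma \sqrt{\frac{\xi_b}{L}}
% Solving for the block length:
\xi_b \sim \frac{\sigma^2}{\rho^2 L}
```

Under these assumptions, stronger selection (larger $\sigma$) lengthens the block, while more frequent recombination (larger $\rho$) shortens it.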