Selective Sampling with Drift
Recently there has been much work on selective sampling, an online active
learning setting, in which algorithms work in rounds. On each round an
algorithm receives an input and makes a prediction. Then, it can decide whether
to query a label, and if so to update its model, otherwise the input is
discarded. Most of this work is focused on the stationary case, where it is
assumed that there is a fixed target model, and the performance of the
algorithm is compared to a fixed model. However, in many real-world
applications, such as spam prediction, the best target function may drift over
time, or have shifts from time to time. We develop a novel selective sampling
algorithm for the drifting setting, analyze it under no assumptions on the
mechanism generating the sequence of instances, and derive new mistake bounds
that depend on the amount of drift in the problem. Simulations on synthetic and
real-world datasets demonstrate the superiority of our algorithm as a
selective sampling algorithm in the drifting setting.
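The round-based protocol described in this abstract (receive an input, predict, optionally query the label, update only on queried rounds) can be sketched as follows. This is a minimal illustration and not the algorithm proposed in the paper: the margin-based randomized query rule with parameter `b`, the perceptron update, and the function name `selective_sampling_run` are all assumptions chosen for brevity, and the sketch has no mechanism for handling drift.

```python
import random

def selective_sampling_run(stream, dim, b=1.0, seed=0):
    """Run a generic selective sampling loop over (x, y) pairs with y in {-1, +1}.

    Assumed query rule (not taken from the abstract): query the label with
    probability b / (b + |margin|), so uncertain predictions are queried more often.
    """
    rng = random.Random(seed)
    w = [0.0] * dim          # linear model, starts at zero
    mistakes = 0             # counted for evaluation only; the learner never sees this
    queries = 0              # number of labels actually requested
    for x, y in stream:
        margin = sum(wi * xi for wi, xi in zip(w, x))
        y_hat = 1 if margin >= 0 else -1
        if y_hat != y:
            mistakes += 1
        # Decide whether to query the label for this round.
        if rng.random() < b / (b + abs(margin)):
            queries += 1
            # Perceptron-style update, applied only on queried mistakes.
            if y_hat != y:
                w = [wi + y * xi for wi, xi in zip(w, x)]
        # Otherwise the input is discarded without seeing its label.
    return w, mistakes, queries
```

A larger `b` makes the learner query more often; as `b` grows the loop approaches a fully supervised perceptron.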
Concepts of Drift and Selection in "The Great Snail Debate" of the 1950s and Early 1960s
Recently, much philosophical discussion has centered on the best way to characterize the concepts of random drift and natural selection, and, in particular, on the question of whether selection and drift can be conceptually distinguished (Beatty 1984; Brandon 2005; Hodge 1983, 1987; Millstein 2002, 2005; Pfeifer 2005; Shanahan 1992; Stephens 2004). These authors all contend, to a greater or lesser degree, that their concepts make sense of biological practice. So, it should be instructive to see how the concepts of drift and selection were distinguished by the disputants in a high-profile debate; debates such as these often force biologists to take a more philosophical turn, discussing the concepts at issue in greater detail than usual. A prime candidate for just such a case study is what William Provine (1986) has termed "The Great Snail Debate," that is, the debate over the highly polymorphic land snails Cepaea nemoralis and Cepaea hortensis in the 1950s and early 1960s. This study will reveal that much of the present-day confusion over the concepts of drift and selection is rooted in confusions of the past. Nonetheless, there are lessons that can be learned about nonadaptiveness, indiscriminate sampling, and causality with respect to these two concepts. In particular, this paper will shed light on the following questions: 1) What is "drift"? Is "drift" a purely mathematical construct, a physical process analogous to the indiscriminate sampling of balls from an urn, or the outcome of a sampling process? 2) What is "nonadaptiveness," and is a proponent of drift committed to claims that organisms' traits are nonadaptive? 3) Can disputes concerning selection and drift be settled by statistics alone, or is causal information essential? If causal information is essential, what does that say about the concepts of "drift" and "selection" themselves?
Session 4: Evolutionary Indeterminism
Proceedings of the Pittsburgh Workshop in History and Philosophy of Biology, Center for Philosophy of Science, University of Pittsburgh, March 23-24, 2001. Session 4: Evolutionary Indeterminism
Coalescence 2.0: a multiple branching of recent theoretical developments and their applications
Population genetics theory has laid the foundations for genomics analyses
including the recent burst in genome scans for selection and statistical
inference of past demographic events in many prokaryote, animal and plant
species. Identifying SNPs under natural selection and underpinning species
adaptation relies on disentangling the respective contribution of random
processes (mutation, drift, migration) from that of selection on nucleotide
variability. Most theory and statistical tests have been developed using the
Kingman coalescent theory based on the Wright-Fisher population model. However,
these theoretical models rely on biological and life-history assumptions which
may be violated in many prokaryote, fungal, animal or plant species. Recent
theoretical developments of the so-called multiple merger coalescent models are
reviewed here (Λ-coalescent, beta-coalescent, Bolthausen-Sznitman,
Ξ-coalescent). We explain how these new models take into account various
pervasive ecological and biological characteristics, life history traits or
life cycles which were not accounted for in previous theories, such as 1) the skew
in offspring production typical of marine species, 2) fast adapting
microparasites (virus, bacteria and fungi) exhibiting large variation in
population sizes during epidemics, 3) the peculiar life cycles of fungi and
bacteria alternating sexual and asexual cycles, and 4) the high rates of
extinction-recolonization in spatially structured populations. Finally, we
discuss the relevance of multiple merger models for the detection of SNPs under
selection in these species and for population genomics with very large sample
sizes, and we advocate re-examining the conclusions of previous population
genetics studies.
Comment: 3 figures
The Effects of Population Size Histories on Estimates of Selection Coefficients from Time-Series Genetic Data.
Many approaches have been developed for inferring selection coefficients from time series data while accounting for genetic drift. These approaches have been motivated by the intuition that properly accounting for the population size history can significantly improve estimates of selective strengths. However, the improvement in inference accuracy that can be attained by modeling drift has not been characterized. Here, by comparing maximum likelihood estimates of selection coefficients that account for the true population size history with estimates that ignore drift by assuming allele frequencies evolve deterministically in a population of infinite size, we address the following questions: how much can modeling the population size history improve estimates of selection coefficients? How much can mis-inferred population sizes hurt inferences of selection coefficients? We conduct our analysis under the discrete Wright-Fisher model by deriving the exact probability of an allele frequency trajectory in a population of time-varying size and we replicate our results under the diffusion model. For both models, we find that ignoring drift leads to estimates of selection coefficients that are nearly as accurate as estimates that account for the true population history, even when population sizes are small and drift is high. This result is of interest because inference methods that ignore drift are widely used in evolutionary studies and can be many orders of magnitude faster than methods that account for population sizes
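The contrast drawn in this abstract, between allele frequencies that evolve deterministically in an infinite population and trajectories subject to drift in a population of time-varying size, can be illustrated with a small simulation. This is a sketch under stated assumptions, not the authors' inference method: it assumes a haploid Wright-Fisher model with relative fitness 1 + s for the selected allele, and the function names are hypothetical.

```python
import random

def deterministic_step(p, s):
    """One generation of selection with no drift (infinite population).

    Assumes a haploid model where the selected allele has fitness 1 + s.
    """
    return p * (1 + s) / (1 + s * p)

def deterministic_trajectory(p0, s, generations):
    """Allele frequency trajectory when drift is ignored."""
    traj = [p0]
    for _ in range(generations):
        traj.append(deterministic_step(traj[-1], s))
    return traj

def wright_fisher_trajectory(p0, s, sizes, seed=0):
    """Allele frequency trajectory with drift in a population of varying size.

    sizes[t] is the (assumed haploid) population size in generation t + 1.
    Each generation applies selection, then binomial sampling of sizes[t] copies.
    """
    rng = random.Random(seed)
    p = p0
    traj = [p]
    for n in sizes:
        p_sel = deterministic_step(p, s)
        # Binomial(n, p_sel) draw, written as n Bernoulli trials.
        count = sum(1 for _ in range(n) if rng.random() < p_sel)
        p = count / n
        traj.append(p)
    return traj
```

Comparing `wright_fisher_trajectory(0.2, 0.05, [50] * 20)` with `deterministic_trajectory(0.2, 0.05, 20)` shows how far a single drift-affected path can wander from the infinite-population curve when population sizes are small.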
Inferring the Distribution of Selective Effects from a Time Inhomogeneous Model
We have developed a Poisson random field model for estimating the distribution of selective effects of newly arisen nonsynonymous mutations that could be observed as polymorphism or divergence in samples of two related species, under the assumption that the two species populations are not at mutation-selection-drift equilibrium. The model is applied to 91 Drosophila genes by comparing levels of polymorphism in an African population of D. melanogaster with divergence to a reference strain of D. simulans. Based on the difference in gene expression level between testes and ovaries, the 91 genes were classified as 33 male-biased, 28 female-biased, and 30 sex-unbiased genes. Under a Bayesian framework, Markov chain Monte Carlo simulations are used to sample key parameters of the model, in which the distribution of selective effects is assumed to be Gaussian with a mean that may differ from one gene to another. Based on our estimates, the majority of newly arisen nonsynonymous mutations that could contribute to polymorphism or divergence in Drosophila species are mildly deleterious, with a mean scaled selection coefficient of -2.81, while almost 86% of the fixed differences between species are driven by positive selection. Only 16.6% of the nonsynonymous mutations observed in sex-unbiased genes are under positive selection, compared with 30% in male-biased genes and 46% in female-biased genes. We also estimated that D. melanogaster and D. simulans may have diverged 1.72 million years ago.
Directional selection effects on patterns of phenotypic (co)variation in wild populations.
Phenotypic (co)variation is a prerequisite for evolutionary change, and understanding how (co)variation evolves is of crucial importance to the biological sciences. Theoretical models predict that under directional selection, phenotypic (co)variation should evolve in step with the underlying adaptive landscape, increasing the degree of correlation among co-selected traits as well as the amount of genetic variance in the direction of selection. Whether either of these outcomes occurs in natural populations is an open question and thus an important gap in evolutionary theory. Here, we documented changes in the phenotypic (co)variation structure in two separate natural populations in each of two chipmunk species (Tamias alpinus and T. speciosus) undergoing directional selection. In populations where selection was strongest (those of T. alpinus), we observed changes, at least for one population, in phenotypic (co)variation that matched theoretical expectations, namely an increase of both phenotypic integration and (co)variance in the direction of selection and a re-alignment of the major axis of variation with the selection gradient
Tradeoff between short-term and long-term adaptation in a changing environment
We investigate the competition dynamics of two microbial or viral strains
that live in an environment that switches periodically between two states. One
of the strains is adapted to the long-term environment, but pays a short-term
cost, while the other is adapted to the short-term environment and pays a cost
in the long term. We explore the tradeoff between these alternative strategies
in extensive numerical simulations, and present a simple analytic model that
can predict the outcome of these competitions as a function of the mutation
rate and the time scale of the environmental changes. Our model is relevant for
arboviruses, which alternate between different host species on a regular basis.
Comment: 9 pages, 3 figures, PRE in press