Incorporating ancestors' influence in genetic algorithms
A new criterion of fitness evaluation for Genetic Algorithms is introduced, in which the fitness value of an individual is determined by considering its own fitness as well as those of its ancestors. Some guidelines for selecting the weighting coefficients for quantifying the importance to be given to the fitness of the individual and its ancestors are provided. This is done both heuristically and automatically under fixed and adaptive frameworks. The Schema Theorem corresponding to the proposed concept is derived. The effectiveness of this new methodology is demonstrated extensively on the problems of optimizing complex functions, including a noisy one, and selecting optimal neural network parameters.
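The idea of blending an individual's fitness with that of its ancestors can be sketched as a normalised weighted sum. The abstract does not give the exact combination rule, so the function name `ancestral_fitness`, the weighted-sum form, and the example weights below are all assumptions made for illustration.

```python
from typing import Sequence


def ancestral_fitness(own: float,
                      ancestors: Sequence[float],
                      weights: Sequence[float]) -> float:
    """Blend an individual's fitness with its ancestors' fitness.

    Assumption (not from the abstract): a normalised weighted sum.
    weights[0] applies to the individual itself, weights[1:] to the
    ancestors, listed nearest ancestor first.
    """
    values = [own, *ancestors]
    if len(values) != len(weights):
        raise ValueError("need exactly one weight per fitness value")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(w * v for w, v in zip(weights, values)) / total_weight


# Hypothetical example: own fitness 10, parent 6, grandparent 2.
blended = ancestral_fitness(10.0, [6.0, 2.0], [0.5, 0.3, 0.2])
```

Setting the first weight to 1 and the rest to 0 recovers ordinary fitness evaluation, which makes it easy to compare against a standard GA.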
Scalability of Genetic Programming and Probabilistic Incremental Program Evolution
This paper discusses scalability of standard genetic programming (GP) and the
probabilistic incremental program evolution (PIPE). To investigate the need for
both effective mixing and linkage learning, two test problems are considered:
ORDER problem, which is rather easy for any recombination-based GP, and TRAP or
the deceptive trap problem, which requires the algorithm to learn interactions
among subsets of terminals. The scalability results show that both GP and PIPE
scale up polynomially with problem size on the simple ORDER problem, but they
both scale up exponentially on the deceptive problem. This indicates that while
standard recombination is sufficient when no interactions need to be
considered, for some problems linkage learning is necessary. These results are
in agreement with the lessons learned in the domain of binary-string genetic
algorithms (GAs). Furthermore, the paper investigates the effects of
introducing unnecessary and irrelevant primitives on the performance of GP and
PIPE. Comment: Submitted to GECCO-200
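To make the notion of a deceptive problem concrete, the following is a minimal sketch of the classic binary-string trap function from the GA literature that the abstract refers to. It is not the tree-based TRAP problem used in the paper, whose exact definition the abstract does not give; the block size `k = 5` is an assumed default.

```python
from typing import Sequence


def trap(block: Sequence[int], k: int = 5) -> int:
    """Deceptive trap on one block of k bits.

    The optimum is the all-ones block, but every other block scores
    higher the fewer ones it contains, which leads bit-wise hill
    climbing away from the optimum.
    """
    ones = sum(block)
    return k if ones == k else k - 1 - ones


def concatenated_trap(bits: Sequence[int], k: int = 5) -> int:
    """Sum of independent traps over consecutive blocks of k bits."""
    if len(bits) % k != 0:
        raise ValueError("string length must be a multiple of k")
    return sum(trap(bits[i:i + k], k) for i in range(0, len(bits), k))
```

Because the bits within a block must be set together to reach the optimum, an algorithm that treats them independently is misled, which is the interaction that linkage learning is meant to capture.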
Making and breaking power laws in evolutionary algorithm population dynamics
Deepening our understanding of the characteristics and behaviors of population-based search algorithms remains an important ongoing challenge in Evolutionary Computation. To date, however, most studies of Evolutionary Algorithms have only been able to take place within tightly restricted experimental conditions. For instance, many analytical methods can only be applied to canonical algorithmic forms or can only evaluate evolution over simple test functions. Analysis of EA behavior under more complex conditions is needed to broaden our understanding of this population-based search process. This paper presents an approach to analyzing EA behavior that can be applied to a diverse range of algorithm designs and environmental conditions. The approach is based on evaluating an individual’s impact on population dynamics using metrics derived from genealogical graphs.
From experiments conducted over a broad range of conditions, some important conclusions are drawn in this study. First, it is determined that very few individuals in an EA population have a significant influence on future population dynamics, with the impact size fitting a power law distribution. The power law distribution indicates there is a non-negligible probability that single individuals will dominate the entire population, irrespective of population size. Two EA design features are, however, found to cause strong changes to this aspect of EA behavior: i) the population topology and ii) the introduction of completely new individuals. If the EA population topology has a long path length or if new (i.e. historically uncoupled) individuals are continually inserted into the population, then power law deviations are observed for large impact sizes. It is concluded that such EA designs cannot be dominated by a small number of individuals and hence should theoretically be capable of exhibiting higher degrees of parallel search behavior.
Computing Individual Risks based on Family History in Genetic Disease in the Presence of Competing Risks
When considering a genetic disease with variable age at onset (e.g., diabetes,
familial amyloid neuropathy, cancers, etc.), computing the individual risk of
the disease based on family history (FH) is of critical interest both for
clinicians and patients. Such a risk is very challenging to compute because: 1)
the genotype X of the individual of interest is in general unknown; 2) the
posterior distribution P(X|FH, T > t) changes with t (T is the age at disease
onset for the targeted individual); 3) the competing risk of death is not
negligible. In this work, we model this problem using a
Bayesian network mixed with (right-censored) survival outcomes where hazard
rates only depend on the genotype of each individual. We explain how belief
propagation can be used to obtain posterior distribution of genotypes given the
FH, and how to obtain a time-dependent posterior hazard rate for any individual
in the pedigree. Finally, we use this posterior hazard rate to compute
individual risk, with or without the competing risk of death. Our method is
illustrated using the Claus-Easton model for breast cancer (BC). This model
assumes an autosomal dominant genetic risk factor such that non-carriers
(genotype 00) have a BC hazard rate λ_0(t) while carriers (genotypes
01, 10 and 11) have a (much greater) hazard rate λ_1(t). Both hazard
rates are assumed to be piecewise constant with known values (cuts at 20, 30,
..., 80 years). The competing risk of death is derived from the national French
registry.
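The dependence of the genotype posterior on t can be illustrated for a single individual with no family history, which is a much simpler setting than the pedigree-wide belief propagation described in the abstract. The sketch below assumes a prior carrier probability, ignores the competing risk of death, and uses made-up hazard values rather than the Claus-Easton ones. The function names are hypothetical.

```python
import math
from typing import Sequence


def cumulative_hazard(t: float, cuts: Sequence[float],
                      rates: Sequence[float]) -> float:
    """Integrate a piecewise-constant hazard from 0 to t.

    rates[0] applies on [0, cuts[0]), rates[i] on [cuts[i-1], cuts[i]),
    and rates[-1] beyond the last cut, so len(rates) == len(cuts) + 1.
    """
    if len(rates) != len(cuts) + 1:
        raise ValueError("need one more rate than cut points")
    edges = [0.0, *cuts, math.inf]
    total = 0.0
    for lo, hi, rate in zip(edges[:-1], edges[1:], rates):
        if t <= lo:
            break
        total += rate * (min(t, hi) - lo)
    return total


def posterior_carrier(t: float, prior: float, cuts: Sequence[float],
                      rates_noncarrier: Sequence[float],
                      rates_carrier: Sequence[float]) -> float:
    """P(carrier | disease-free at age t) by Bayes' rule.

    Survival under each genotype is exp(-cumulative hazard).
    """
    s0 = math.exp(-cumulative_hazard(t, cuts, rates_noncarrier))
    s1 = math.exp(-cumulative_hazard(t, cuts, rates_carrier))
    return prior * s1 / (prior * s1 + (1.0 - prior) * s0)


# Illustrative values only (not the Claus-Easton parameters).
cuts = [20, 30, 40, 50, 60, 70, 80]
noncarrier = [0.0, 0.0001, 0.0005, 0.001, 0.002, 0.002, 0.003, 0.003]
carrier = [0.0, 0.002, 0.01, 0.02, 0.03, 0.03, 0.02, 0.02]
p_at_60 = posterior_carrier(60, 0.01, cuts, noncarrier, carrier)
```

Because carriers have the higher hazard, remaining disease-free for longer lowers the posterior carrier probability, which is why the posterior must be recomputed as t changes.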
Tree-guided group lasso for multi-response regression with structured sparsity, with an application to eQTL mapping
We consider the problem of estimating a sparse multi-response regression
function, with an application to expression quantitative trait locus (eQTL)
mapping, where the goal is to discover genetic variations that influence
gene-expression levels. In particular, we investigate a shrinkage technique
capable of capturing a given hierarchical structure over the responses, such as
a hierarchical clustering tree with leaf nodes for responses and internal nodes
for clusters of related responses at multiple granularity, and we seek to
leverage this structure to recover covariates relevant to each
hierarchically-defined cluster of responses. We propose a tree-guided group
lasso, or tree lasso, for estimating such structured sparsity under
multi-response regression by employing a novel penalty function constructed
from the tree. We describe a systematic weighting scheme for the overlapping
groups in the tree-penalty such that each regression coefficient is penalized
in a balanced manner despite the inhomogeneous multiplicity of group
memberships of the regression coefficients due to overlaps among groups. For
efficient optimization, we employ a smoothing proximal gradient method that was
originally developed for a general class of structured-sparsity-inducing
penalties. Using simulated and yeast data sets, we demonstrate that our method
shows a superior performance in terms of both prediction errors and recovery of
true sparsity patterns, compared to other methods for learning a
multivariate-response regression. Comment: Published at http://dx.doi.org/10.1214/12-AOAS549 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org
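A penalty built from overlapping groups of responses, one group per tree node, can be sketched as follows. This is a generic overlapping group-lasso penalty under assumed inputs: the node weights are placeholders supplied by the caller and do not reproduce the balanced weighting scheme the abstract describes, and the function name is hypothetical.

```python
import math
from typing import Sequence


def tree_group_penalty(coefs: Sequence[Sequence[float]],
                       groups: Sequence[Sequence[int]],
                       weights: Sequence[float]) -> float:
    """Sum over covariates and tree nodes of w_v * ||beta_j[G_v]||_2.

    coefs:   one row per covariate, one coefficient per response.
    groups:  for each tree node, the indices of the responses (leaves)
             below it; groups overlap because internal nodes contain
             their descendants' leaves.
    weights: one non-negative weight per node (placeholder values).
    """
    if len(groups) != len(weights):
        raise ValueError("need exactly one weight per group")
    total = 0.0
    for row in coefs:
        for group, weight in zip(groups, weights):
            total += weight * math.sqrt(sum(row[k] ** 2 for k in group))
    return total


# Three responses; the tree has leaves {0}, {1}, {2}, an internal
# node {0, 1}, and the root {0, 1, 2}.
groups = [[0], [1], [2], [0, 1], [0, 1, 2]]
penalty = tree_group_penalty([[3.0, 4.0, 0.0]], groups, [1.0] * 5)
```

With unit weights, coefficients shared by many nested groups are penalised several times over, which is the imbalance that a systematic weighting scheme is meant to correct.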
Healthcare-associated outbreak of meticillin-resistant Staphylococcus aureus bacteraemia: role of a cryptic variant of an epidemic clone
BACKGROUND
New strains of meticillin-resistant Staphylococcus aureus (MRSA) may be associated with changes in rates of disease or clinical presentation. Conventional typing techniques may not detect new clonal variants that underlie changes in epidemiology or clinical phenotype.
AIM
To investigate the role of clonal variants of MRSA in an outbreak of MRSA bacteraemia at a hospital in England.
METHODS
Bacteraemia isolates of the major UK lineages (EMRSA-15 and -16) from before and after the outbreak were analysed by whole-genome sequencing in the context of epidemiological and clinical data. For comparison, EMRSA-15 and -16 isolates from another hospital in England were sequenced. A clonal variant of EMRSA-16 was identified at the outbreak hospital and a molecular signature test designed to distinguish variant isolates among further EMRSA-16 strains.
FINDINGS
By whole-genome sequencing, EMRSA-16 isolates during the outbreak showed strikingly low genetic diversity (P < 1 × 10^-6, Monte Carlo test), compared with EMRSA-15 and EMRSA-16 isolates from before the outbreak or the comparator hospital, demonstrating the emergence of a clonal variant. The variant was indistinguishable from the ancestral strain by conventional typing. This clonal variant accounted for 64/72 (89%) of EMRSA-16 bacteraemia isolates at the outbreak hospital from 2006.
CONCLUSIONS
Evolutionary changes in epidemic MRSA strains not detected by conventional typing may be associated with changes in disease epidemiology. Rapid and affordable technologies for whole-genome sequencing are becoming available, with the potential to identify and track the emergence of variants of highly clonal organisms.
Bayesian total evidence dating reveals the recent crown radiation of penguins
The total-evidence approach to divergence-time dating uses molecular and
morphological data from extant and fossil species to infer phylogenetic
relationships, species divergence times, and macroevolutionary parameters in a
single coherent framework. Current model-based implementations of this approach
lack an appropriate model for the tree describing the diversification and
fossilization process and can produce estimates that lead to erroneous
conclusions. We address this shortcoming by providing a total-evidence method
implemented in a Bayesian framework. This approach uses a mechanistic tree
prior to describe the underlying diversification process that generated the
tree of extant and fossil taxa. Previous attempts to apply the total-evidence
approach have used tree priors that do not account for the possibility that
fossil samples may be direct ancestors of other samples. The fossilized
birth-death (FBD) process explicitly models the diversification, fossilization,
and sampling processes and naturally allows for sampled ancestors. This model
was recently applied to estimate divergence times based on molecular data and
fossil occurrence dates. We incorporate the FBD model and a model of
morphological trait evolution into a Bayesian total-evidence approach to dating
species phylogenies. We apply this method to extant and fossil penguins and
show that the modern penguins radiated much more recently than has been
previously estimated, with the basal divergence in the crown clade occurring at
~12.7 Ma and most splits leading to extant species occurring in the last 2
million years. Our results demonstrate that including stem-fossil diversity can
greatly improve the estimates of the divergence times of crown taxa. The method
is available in BEAST2 (v. 2.4) www.beast2.org with packages SA (v. at least
1.1.4) and morph-models (v. at least 1.0.4). Comment: 50 pages, 6 figure