First-principles molecular structure search with a genetic algorithm
The identification of low-energy conformers for a given molecule is a
fundamental problem in computational chemistry and cheminformatics. We assess
here a conformer search that employs a genetic algorithm for sampling the
low-energy segment of the conformation space of molecules. The algorithm is
designed to work with first-principles methods, facilitated by the
incorporation of local optimization and blacklisting conformers to prevent
repeated evaluations of very similar solutions. The aim of the search is not
only to find the global minimum, but to predict all conformers within an energy
window above the global minimum. The performance of the search strategy is: (i)
evaluated for a reference data set extracted from a database with amino acid
dipeptide conformers obtained by an extensive combined force field and
first-principles search and (ii) compared to the performance of a systematic
search and a random conformer generator for the example of a drug-like ligand
with 43 atoms, 8 rotatable bonds and 1 cis/trans bond
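The search loop described in this abstract (genetic sampling, local optimization, and a blacklist that blocks near-duplicate evaluations) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation. The toy energy function, the coordinate-descent optimizer, the torsion-only representation, the similarity tolerance and all function names are assumptions made for the example; a real search would call a first-principles code instead of `toy_energy`.

```python
import math
import random

TWO_PI = 2.0 * math.pi

def wrap(angle):
    """Map an angle in radians into [-pi, pi)."""
    return (angle + math.pi) % TWO_PI - math.pi

def angular_diff(a, b):
    """Smallest absolute difference between two angles."""
    d = abs(a - b) % TWO_PI
    return min(d, TWO_PI - d)

def toy_energy(torsions):
    """Hypothetical stand-in for an expensive first-principles energy call."""
    return sum(1.0 + math.cos(3.0 * t) + 0.5 * math.cos(t) for t in torsions)

def local_optimize(torsions, step=0.05, max_sweeps=200):
    """Crude coordinate descent standing in for a real geometry optimizer."""
    best, best_e = list(torsions), toy_energy(torsions)
    for _ in range(max_sweeps):
        improved = False
        for i in range(len(best)):
            for delta in (step, -step):
                trial = list(best)
                trial[i] = wrap(trial[i] + delta)
                e = toy_energy(trial)
                if e < best_e - 1e-12:
                    best, best_e, improved = trial, e, True
        if not improved:
            break
    return best, best_e

def is_similar(torsions, known, tol=0.3):
    """True if all torsions lie within tol of some already known conformer."""
    return any(all(angular_diff(a, b) < tol for a, b in zip(torsions, k))
               for k in known)

def ga_search(n_torsions=3, pop_size=6, generations=30, window=1.0, seed=0):
    rng = random.Random(seed)
    blacklist, pool = [], []  # evaluated structures; (energy, torsions) pairs

    def evaluate(candidate):
        if is_similar(candidate, blacklist):
            return  # skip the expensive evaluation of a near-duplicate
        blacklist.append(candidate)
        opt, e = local_optimize(candidate)
        if not is_similar(opt, [t for _, t in pool]):
            pool.append((e, opt))
        blacklist.append(opt)

    # Random initial population.
    for _ in range(pop_size):
        evaluate([rng.uniform(-math.pi, math.pi) for _ in range(n_torsions)])
    for _ in range(generations):
        pool.sort(key=lambda item: item[0])
        if len(pool) < 2:
            break
        # Pick two parents among the lowest-energy conformers.
        (e1, p1), (e2, p2) = rng.sample(pool[:pop_size], 2)
        cut = rng.randrange(1, n_torsions)       # one-point crossover
        child = p1[:cut] + p2[cut:]
        i = rng.randrange(n_torsions)            # mutate one torsion
        child[i] = rng.uniform(-math.pi, math.pi)
        evaluate(child)
    pool.sort(key=lambda item: item[0])
    e_min = pool[0][0]
    # Report every conformer inside the energy window, not only the minimum.
    return [(e, t) for e, t in pool if e - e_min <= window]
```

Calling `ga_search()` returns a list of `(energy, torsions)` pairs sorted by energy. Returning the whole window rather than a single structure mirrors the stated aim of predicting all conformers near the global minimum.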
Scaling laws in bacterial genomes: A side-effect of selection of mutational robustness?
In the past few years, numerous research projects have focused on identifying and understanding scaling properties in the gene content of prokaryote genomes and the intricacy of their regulation networks. Yet, despite the increasing amount of available data, the origins of these scalings remain an open question. The RAevol model, a digital genetics model, provides us with an insight into the mechanisms involved in an evolutionary process. The results we present here show that (i) our model qualitatively reproduces these scaling laws and that (ii) these laws are not due to differences in lifestyles but to differences in the spontaneous rates of mutations and rearrangements. We argue that this is due to an indirect selective pressure for robustness that constrains the genome size.
Statistical mechanics and thermodynamics of viral evolution
This paper analyzes a simplified model of viral infection and evolution using
the 'grand canonical ensemble' and formalisms from statistical mechanics and
thermodynamics to enumerate all possible viruses and to derive thermodynamic
variables for the system. We model the infection process as a series of energy
barriers determined by the genetic states of the virus and host as a function
of immune response and system temperature. We find a phase transition between a
positive temperature regime of normal replication and a negative temperature
'disordered' phase of the virus. These phases define different regimes in which
different genetic strategies are favored. Perhaps most importantly, it
demonstrates that the system has a real thermodynamic temperature. For normal
replication, this temperature is linearly related to effective temperature. The
strength of immune response rescales temperature but does not change the
observed linear relationship. For all temperatures and immunities studied, we
find a universal curve relating the order parameter to viral evolvability. Real
viruses have finite length RNA segments that encode for proteins which
determine their fitness; hence the methods put forth here could be refined to
apply to real biological systems, perhaps providing insight into immune escape,
the emergence of novel pathogens and other results of viral evolution.
Comment: 39 pages (55 pages including supplement), 9 figures, 11 supplemental
figures
On the design of an ECOC-compliant genetic algorithm
Genetic Algorithms (GA) have been previously applied to Error-Correcting Output Codes (ECOC) in state-of-the-art works in order to find a suitable coding matrix. Nevertheless, none of the presented techniques directly take into account the properties of the ECOC matrix. As a result, the considered search space is unnecessarily large. In this paper, a novel genetic strategy to optimize the ECOC coding step is presented. This novel strategy redefines the usual crossover and mutation operators in order to take into account the theoretical properties of the ECOC framework. Thus, it reduces the search space and lets the algorithm converge faster. In addition, a novel operator that is able to enlarge the code in a smart way is introduced. The novel methodology is tested on several UCI datasets and four challenging computer vision problems. Furthermore, the analysis of the results done in terms of performance, code length and number of Support Vectors shows that the optimization process is able to find very efficient codes, in terms of the trade-off between classification performance and the number of classifiers. Finally, classification performance per dichotomizer results shows that the novel proposal is able to obtain similar or even better results while defining a more compact number of dichotomies and SVs compared to state-of-the-art approaches.
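As a rough illustration of what "properties of the ECOC matrix" can mean in practice, the sketch below checks the conditions commonly stated for a binary coding matrix (distinct rows, and columns that are neither constant, repeated, nor complements of one another) and decodes by Hamming distance. The abstract does not list which properties the authors enforce, so this property set, the {-1, +1} encoding and the function names are assumptions for the example.

```python
def is_valid_ecoc(matrix):
    """Check a binary {-1, +1} coding matrix (rows = classes, columns = dichotomizers)."""
    rows = [tuple(r) for r in matrix]
    cols = [tuple(c) for c in zip(*matrix)]
    # Every class needs its own codeword.
    if len(set(rows)) != len(rows):
        return False
    seen = set()
    for col in cols:
        # A constant column does not split the classes at all.
        if len(set(col)) < 2:
            return False
        # A repeated or complementary column trains the same dichotomizer twice.
        complement = tuple(-v for v in col)
        if col in seen or complement in seen:
            return False
        seen.add(col)
    return True

def hamming_decode(matrix, outputs):
    """Return the index of the class whose codeword is closest to the outputs."""
    distances = [sum(a != b for a, b in zip(row, outputs)) for row in matrix]
    return distances.index(min(distances))

one_vs_all = [
    [+1, -1, -1],
    [-1, +1, -1],
    [-1, -1, +1],
]
```

A crossover or mutation operator that only ever produces matrices passing such a check searches a smaller space than one that generates arbitrary matrices and discards the invalid ones afterwards.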
Genetic programming: the ratio of crossover to mutation as a function of time
This article studies the sub-tree operators: mutation and crossover, within
the context of Genetic Programming. Two standard problems, symbolic linear
regression and a non-linear tree, were presented to the algorithm at each stage.
The behaviour of the operators in regard to fitness is first established, followed
by an analysis of the optimal ratio between crossover and mutation.
Subsequently, three algorithms are presented as candidates to dynamically
learn the optimal level of this ratio. The results of each algorithm are
then compared to each other and the traditional constant ratio.
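The abstract does not describe the three candidate algorithms, so the following is only a generic sketch of how a crossover-to-mutation ratio might be learned during a run. It nudges the crossover probability toward whichever operator last produced a fitness improvement. The update rule, the parameter names `rate` and `floor`, and the function name are all hypothetical.

```python
def update_crossover_prob(p_cross, used_crossover, improved, rate=0.1, floor=0.05):
    """Return a new crossover probability; mutation is applied with 1 - p_cross.

    p_cross        current probability of choosing crossover
    used_crossover True if the last offspring came from crossover
    improved       True if that offspring beat its parent(s)
    """
    if improved:
        # Move toward 1.0 after a successful crossover, toward 0.0 after
        # a successful mutation.
        target = 1.0 if used_crossover else 0.0
        p_cross += rate * (target - p_cross)
    # Keep both operators alive so neither is switched off permanently.
    return min(1.0 - floor, max(floor, p_cross))
```

A constant ratio corresponds to never calling the update, which gives a baseline for comparing any adaptive scheme.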
Synthesis of Parametric Programs using Genetic Programming and Model Checking
Formal methods apply algorithms based on mathematical principles to enhance
the reliability of systems. It would only be natural to try to progress from
verification, model checking or testing a system against its formal
specification into constructing it automatically. Classical algorithmic
synthesis theory provides interesting algorithms but also alarmingly high
complexity and undecidability results. The use of genetic programming, in
combination with model checking and testing, provides a powerful heuristic to
synthesize programs. The method is not completely automatic, as it is
fine-tuned by a user who sets up the specification and parameters. Nor is it
guaranteed to always succeed and converge towards a solution that satisfies all
the required properties. However, we applied it successfully on quite
nontrivial examples and managed to find solutions to hard programming
challenges, as well as to improve and to correct code. We describe here several
versions of our method for synthesizing sequential and concurrent systems.
Comment: In Proceedings INFINITY 2013, arXiv:1402.661
Hybrid Iterative Multiuser Detection for Channel Coded Space Division Multiple Access OFDM Systems
Space division multiple access (SDMA) aided orthogonal frequency division multiplexing (OFDM) systems assisted by efficient multiuser detection (MUD) techniques have recently attracted intensive research interest. The maximum likelihood detection (MLD) arrangement was found to attain the best performance, although this was achieved at the cost of a computational complexity that increases exponentially both with the number of users and with the number of bits per symbol transmitted by higher order modulation schemes. By contrast, the minimum mean-square error (MMSE) SDMA-MUD exhibits a lower complexity at the cost of a performance loss. Forward error correction (FEC) schemes such as, for example, turbo trellis coded modulation (TTCM), may be efficiently combined with SDMA-OFDM systems for the sake of improving the achievable performance. Genetic algorithm (GA) based multiuser detection techniques have been shown to provide a good performance in MUD-aided code division multiple access (CDMA) systems. In this contribution, a GA-aided MMSE MUD is proposed for employment in a TTCM assisted SDMA-OFDM system, which is capable of achieving a similar performance to that attained by its optimum MLD-aided counterpart at a significantly lower complexity, especially at high user loads. Moreover, when the proposed biased Q-function based mutation (BQM) assisted iterative GA (IGA) MUD is employed, the GA-aided system’s performance can be further improved, for example, by reducing the bit error ratio (BER) measured at 3 dB by about five orders of magnitude in comparison to the TTCM assisted MMSE-SDMA-OFDM benchmarker system, while still maintaining modest complexity.