83 research outputs found

    A Parallel Divide-and-Conquer based Evolutionary Algorithm for Large-scale Optimization

    Large-scale optimization problems that involve thousands of decision variables have arisen extensively in various industrial areas. Although evolutionary algorithms (EAs) are a powerful optimization tool for many real-world applications, they fail to solve these emerging large-scale problems both effectively and efficiently. In this paper, we propose a novel Divide-and-Conquer (DC) based EA that not only produces high-quality solutions by solving sub-problems separately, but also makes full use of parallel computing by solving the sub-problems simultaneously. Existing DC-based EAs, which were deemed to enjoy the same advantages as the proposed algorithm, are shown to be practically incompatible with the parallel computing scheme unless solution quality is compromised. Comment: 12 pages, 0 figures

    Understanding the Structural and Functional Importance of Early Folding Residues in Protein Structures

    Proteins adopt three-dimensional structures which serve as a starting point to understand protein function and their evolutionary ancestry. It is unclear how proteins fold in vivo and how this process can be recreated in silico in order to predict protein structure from sequence. Contact maps describe whether two residues are in spatial proximity, and structures can be derived from this simplified representation. Coevolution or supervised machine learning techniques can compute contact maps from sequence; however, these approaches only predict sparse subsets of the actual contact map. It is shown that the composition of these subsets substantially influences the achievable reconstruction quality because most information in a contact map is redundant. No strategy has been proposed which identifies unique contacts for which no redundant backup exists. The StructureDistiller algorithm quantifies the structural relevance of individual contacts and identifies crucial contacts in protein structures. It is demonstrated that using this information the reconstruction performance on a sparse subset of a contact map is increased by 0.4 Å, which constitutes a substantial performance gain. The set of the most relevant contacts in a map is also more resilient to false positively predicted contacts: up to 6% of false positives are compensated before reconstruction quality matches a naive selection of contacts without any false positive contacts. This information is invaluable for the training of new structure prediction methods and provides insights into how robustness and information content of contact maps can be improved. In the literature, the relevance of two types of residues for in vivo folding has been described. Early folding residues initiate the folding process, whereas highly stable residues prevent spontaneous unfolding events. The structural relevance score proposed by this thesis is employed to characterize both types of residues. Early folding residues form pivotal secondary structure elements, but their structural relevance is average. In contrast, highly stable residues exhibit significantly increased structural relevance. This implies that residues crucial for the folding process are not relevant for structural integrity and vice versa. The position of early folding residues is preserved over the course of evolution as demonstrated for two ancient regions shared by all aminoacyl-tRNA synthetases. One arrangement of folding initiation sites resembles an ancient and widely distributed structural packing motif and captures how reverberations of the earliest periods of life can still be observed in contemporary protein structures.
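
The contact-map representation mentioned in this abstract can be illustrated with a minimal sketch. It is not the StructureDistiller algorithm: it only shows how a binary contact map might be derived from residue coordinates. The function name `contact_map`, the use of one representative point per residue, and the 8 Å distance cutoff are all assumptions made for illustration.

```python
import math


def contact_map(coords, threshold=8.0):
    """Build a symmetric boolean contact map from residue coordinates.

    coords    -- list of (x, y, z) tuples, one representative point per residue
                 (assumption: e.g. one atom position per residue)
    threshold -- distance cutoff in angstroms (assumption: 8.0)
    """
    n = len(coords)
    contacts = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Two residues are "in contact" if their points are closer than the cutoff.
            if math.dist(coords[i], coords[j]) < threshold:
                contacts[i][j] = contacts[j][i] = True
    return contacts


# Hypothetical toy coordinates: two neighbouring residues and one distant residue.
toy = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (20.0, 0.0, 0.0)]
cmap = contact_map(toy)
```

The diagonal is left as `False` because a residue's contact with itself carries no structural information.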

    Using MapReduce Streaming for Distributed Life Simulation on the Cloud

    Distributed software simulations are indispensable in the study of large-scale life models but often require the use of technically complex lower-level distributed computing frameworks, such as MPI. We propose to overcome the complexity challenge by applying the emerging MapReduce (MR) model to distributed life simulations and by running such simulations on the cloud. Technically, we design optimized MR streaming algorithms for discrete and continuous versions of Conway’s life according to a general MR streaming pattern. We chose life because it is simple enough as a testbed for MR’s applicability to a-life simulations and general enough to make our results applicable to various lattice-based a-life models. We implement and empirically evaluate our algorithms’ performance on Amazon’s Elastic MR cloud. Our experiments demonstrate that a single MR optimization technique called strip partitioning can reduce the execution time of continuous life simulations by 64%. To the best of our knowledge, we are the first to propose and evaluate MR streaming algorithms for lattice-based simulations. Our algorithms can serve as prototypes in the development of novel MR simulation algorithms for large-scale lattice-based a-life models.
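
As a rough illustration of how one generation of discrete Conway's Life can be phrased as a map step followed by a reduce step, consider the in-memory sketch below. It is not the authors' streaming algorithm and does not implement strip partitioning or run on any MapReduce framework; the function names `map_cell`, `reduce_cell`, and `life_step`, and the unbounded grid, are assumptions made for illustration.

```python
from collections import defaultdict


def map_cell(cell):
    """Map step: a live cell emits 0 for itself and 1 for each of its 8 neighbours."""
    x, y = cell
    yield (x, y), 0  # marker: this key is currently alive
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            if (dx, dy) != (0, 0):
                yield (x + dx, y + dy), 1  # one live-neighbour contribution


def reduce_cell(values):
    """Reduce step: decide whether a cell is alive in the next generation."""
    alive = 0 in values        # the self-marker is present only for live cells
    neighbours = sum(values)   # markers are 0, so only neighbour counts add up
    return neighbours == 3 or (alive and neighbours == 2)


def life_step(live_cells):
    """Run one generation: group mapper output by key, then reduce each group."""
    groups = defaultdict(list)
    for cell in live_cells:
        for key, value in map_cell(cell):
            groups[key].append(value)
    return {key for key, values in groups.items() if reduce_cell(values)}


blinker = {(0, 1), (1, 1), (2, 1)}
next_gen = life_step(blinker)
```

Only live cells are mapped, so the work per generation scales with the number of live cells and not with the lattice size.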

    Evolutionary-based methods for predicting genotype-phenotype associations in the mammalian genome

    Phenotypic and genotypic variation between species is the result of millions of experiments performed by nature. Understanding why and how phenotypic complexity arises is a central goal of evolutionary biology. Technological advancements enabling whole genome sequencing have laid the foundation for developing comparative genomics-based tools for inferring genetic elements underlying phenotypic adaptations. The work covered as part of this thesis will develop these tools drawing from principles of convergent evolution, aimed at generating specific functional hypotheses that can help focus experimental efforts. These tools will be relevant for characterizing context-specific functions of cis-regulatory elements as well as protein-coding genes, where a large number lack functional annotation beyond domain homology. Expanding from one-dimensional approaches studying proteins in isolation, we propose to build an integrated co-evolutionary framework that will serve as a powerful tool for protein interaction prediction. In this dissertation, we discuss these ideas through the following three projects. In chapter 1, we perform a genome-wide scan for genes showing convergent rate changes in four subterranean mammals, and study the underlying changes in selective pressure causing these convergent shifts in rate. Using a new variant of our rates-based method, we demonstrate that eye-specific regulatory regions show strong rate accelerations in the subterranean mammals. This study demonstrates the potential of convergent evolution-based tools in the functional annotation of eye-specific genetic elements. In chapter 2, we build a robust method to infer shifts in rate associated with a wide range of evolutionary scenarios. We investigate the statistical underpinnings of our rates-based framework and identify the best performing variant of our method across real and simulated phylogenetic datasets. We distribute these tools to the research community, enabling large scale generation of specific functional hypotheses for regulatory regions. In chapter 3, we propose to construct a powerful framework for protein interaction prediction using integration of proteome-wide co-evolutionary signatures. We systematically benchmark the predictions of our coevolutionary framework using known functional interactions among proteins across various scales. We make the predictions of the framework publicly available, useful for functional annotation of less well-characterized genes.

    Multiobjective Optimization of Fuzzy System for Cardiovascular Risk Classification

    Since cardiovascular diseases (CVDs) pose a critical global concern, identifying associated risk factors remains a pivotal research focus. This study aims to propose and optimize a fuzzy system for cardiovascular risk (CVR) classification using a multiobjective approach, addressing computational aspects such as the configuration of the fuzzy system, the optimization process, the selection of a suitable solution from the optimal Pareto front, and the interpretability of the fuzzy logic system after the optimization process. The proposed system uses data including age, weight, height, gender, and systolic blood pressure to determine cardiovascular risk. The fuzzy model is based on preliminary information from the literature; therefore, to adjust the fuzzy logic system using a multiobjective approach, the body mass index (BMI) is considered as an additional output, since data are available for this index. BMI is acknowledged as a proxy for cardiovascular risk given the propensity for these diseases attributed to surplus adipose tissue, which can elevate blood pressure, cholesterol, and triglyceride levels, leading to arterial and cardiac damage. By employing a multiobjective approach, the study aims to obtain a balance between the two outputs corresponding to cardiovascular risk classification and body mass index. For the multiobjective optimization, a set of experiments is proposed that yields an optimal Pareto front, from which the appropriate solution is then determined. The results show an adequate optimization of the fuzzy logic system, allowing the interpretability of the fuzzy sets after carrying out the optimization process. In this way, this paper contributes to the advancement of the use of computational techniques in the medical domain.
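
The notion of a Pareto front used in this abstract can be sketched as a simple non-dominated filter over candidate solutions. This is a generic illustration, not the optimization procedure of the paper: the function name `pareto_front`, the assumption that both objectives are minimised, and the sample error values are all hypothetical.

```python
def pareto_front(points):
    """Return the non-dominated points, assuming every objective is minimised."""

    def dominates(a, b):
        # a dominates b if it is no worse in every objective and strictly better in one.
        return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical (risk-classification error, BMI error) pairs for five candidate systems.
candidates = [(1, 5), (2, 3), (3, 4), (4, 1), (5, 5)]
front = pareto_front(candidates)
```

Choosing a single solution from the front still requires a preference between the two objectives, which the filter itself does not encode.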

    Dynamics of Macrosystems; Proceedings of a Workshop, September 3-7, 1984

    There is an increasing awareness of the important and pervasive role that instability and random, chaotic motion play in the dynamics of macrosystems. Further research in the field should aim at providing useful tools, and therefore the motivation should come from important questions arising in specific macrosystems. Such systems include biochemical networks, genetic mechanisms, biological communities, neural networks, cognitive processes and economic structures. This list may seem heterogeneous, but there are similarities between evolution in the different fields. It is not surprising that mathematical methods devised in one field can also be used to describe the dynamics of another. IIASA is attempting to make progress in this direction. With this aim in view this workshop was held at Laxenburg over the period 3-7 September 1984. These Proceedings cover a broad canvas, ranging from specific biological and economic problems to general aspects of dynamical systems and evolutionary theory.

    Evolvability and organismal architecture: The blind watchmaker and the reminiscent architect

    Organisms are constantly faced with the challenge of adapting to new circumstances. In this thesis, I argue that the ability to adapt to new circumstances, “evolvability”, is deeply ingrained in the genetic, developmental, morphological, and physiological architecture of organisms. Using a blend of conceptual research, theoretical modelling, and multidisciplinary studies, I demonstrate how organismal architecture can evolve so that organisms can cope better and better with future environmental challenges. As a first step, I systematically classify the many factors contributing to evolvability. Then I use a simulation approach to show how evolvability-enhancing structures can readily evolve in gene-regulatory networks. This happens via the evolution of “mutational transformers”: structural elements that convert random mutations at the genetic level into adaptation-enhancing mutations at the phenotypic level. In another thesis chapter, I demonstrate that even if selection acts only sporadically, complex adaptations can evolve and persist over long time periods. In other words, complex adaptations do not require constant selection pressure. In an interdisciplinary contribution, I apply biological insights regarding the properties of an evolvability-enhancing mutation structure to the design of algorithms used in Artificial Intelligence. The result is the “Facilitated Mutation” method which enhances the performance of the algorithms in various respects, highlighting the potential for leveraging biological principles in computational sciences. Finally, I embed my research findings in a philosophical context. I emphasise the importance of organismal architecture in retaining evolutionary memories and suggest future research directions to further enhance our understanding of evolvability.