3,670 research outputs found

    Evolve the Model Universe of a System Universe

    Uncertain, unpredictable, real-time, and lifelong evolution causes operational failures in intelligent software systems, leading to significant damage, safety and security hazards, and tragedies. To fully unleash the potential of such systems and facilitate their wider adoption, ensuring the trustworthiness of their decision making under uncertainty is the prime challenge. To overcome this challenge, an intelligent software system and its operating environment should be continuously monitored, tested, and refined during its lifetime operation. Existing technologies, such as digital twins, can enable continuous synchronisation with such systems to reflect their most up-to-date states. Such representations are often in the form of prior-knowledge-based and machine learning models, together called the model universe. In this paper, we present our vision of combining techniques from software engineering, evolutionary computation, and machine learning to support the model universe evolution.

    Disentangling evolutionary signals: conservation, specificity determining positions and coevolution. Implication for catalytic residue prediction

    Background: A large panel of methods exists that aim to identify residues with critical impact on protein function based on evolutionary signals, sequence and structure information. However, it is not clear to what extent these different methods overlap, or whether any of the methods has higher predictive potential than the others for, in particular, the identification of catalytic residues (CR) in proteins. Using a large set of enzymatic protein families and measures based on different evolutionary signals, we sought to break up the different components of the information content within a multiple sequence alignment to investigate their predictive potential and degree of overlap. Results: Our results demonstrate that the methods included in the benchmark can in general be divided into three groups with limited mutual overlap: one group contains real-value Evolutionary Trace (rvET) methods and conservation, another contains mutual information (MI) methods, and the last contains methods designed explicitly for the identification of specificity determining positions (SDPs): integer-value Evolutionary Trace (ivET), SDPfox, and XDET. In terms of prediction of CR, we find, using a proximity score integrating structural information (the sum of the scores of residues located within a given distance of the residue in question), that only the methods from the first two groups displayed a reliable performance. Next, we investigated to what degree proximity scores for conservation, rvET and cumulative MI (cMI) provide complementary information capable of improving the performance for CR identification. We found that integrating conservation with proximity scores for rvET and cMI achieved the highest performance. The proximity conservation score contained no complementary information when integrated with proximity rvET. Moreover, the signal from rvET provided only a limited gain in predictive performance when integrated with mutual information and conservation proximity scores. Combined, these observations demonstrate that the rvET and cMI scores add complementary information to the prediction system. Conclusions: This work contributes to the understanding of the different signals of evolution and also shows that it is possible to improve the detection of catalytic residues by integrating structural and higher order sequence evolutionary information with sequence conservation.
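
The proximity score described in this abstract (the sum of per-residue scores over residues within a given distance) can be illustrated with a minimal sketch. The function name, the use of one 3D coordinate per residue, the `radius` parameter, and the choice to count the residue itself are all assumptions made for illustration; the abstract does not specify them.

```python
import math

def proximity_scores(coords, scores, radius, include_self=True):
    """Sum per-residue scores over all residues within `radius` of each residue.

    coords: one (x, y, z) tuple per residue (hypothetical representation).
    scores: one per-residue score (e.g. conservation), same order as coords.
    include_self: whether a residue's own score counts (an assumption).
    """
    if len(coords) != len(scores):
        raise ValueError("coords and scores must have the same length")
    result = []
    for i, centre in enumerate(coords):
        total = 0.0
        for j, other in enumerate(coords):
            if i == j and not include_self:
                continue
            # Euclidean distance between the two residue positions
            if math.dist(centre, other) <= radius:
                total += scores[j]
        result.append(total)
    return result

# Toy data: three residues on a line, with made-up scores.
toy_coords = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
toy_scores = [0.9, 0.5, 0.2]
print(proximity_scores(toy_coords, toy_scores, radius=5.0))
```

In the toy data the first two residues lie within the radius of each other and so share their scores, while the third is isolated and keeps only its own.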

    In silico identification of functional divergence between the multiple groEL gene paralogs in Chlamydiae

    Background: Heat-shock proteins are specialized molecules performing different and essential roles in the cell, including protein degradation, folding and trafficking. GroEL is a 60 kDa heat-shock protein ubiquitous in bacteria and has been regarded as an important molecule implicated in chronic inflammatory processes caused by Chlamydiae infections. GroEL in Chlamydiae became duplicated at the origin of the Chlamydiae lineage, presenting three distinct molecular chaperones, namely the original protein GroEL1 (Ct110) and its paralogous proteins GroEL2 (Ct604) and GroEL3 (Ct755). These chaperones present differential and independent expression during the different stages of Chlamydiae infections and have been suggested to have differential physiological and regulatory roles. Results: In this comprehensive in silico study we show that GroEL protein paralogs have diverged functionally after the different gene duplication events and that this divergence has occurred mainly between GroEL3 and GroEL1. GroEL2 presents an intermediate functional divergence pattern from GroEL1. Our results point to different protein-protein interaction patterns between GroEL paralogs and known GroEL protein clients, supporting their functional divergence after groEL gene duplication. Analysis of selective constraints identifies periods of adaptive evolution after gene duplication that led to the fixation of amino acid replacements in GroEL protein domains involved in the interaction with GroEL protein clients. Conclusion: We demonstrate that GroEL protein copies in Chlamydiae species have diverged functionally after the gene duplication events. We also show that functional divergence has occurred in important functional regions of these GroEL proteins and has very probably affected the ancestral GroEL regulatory role and protein-protein interaction patterns with GroEL client proteins. Most of the amino acid replacements that affected interaction with protein clients and that were responsible for the functional divergence between GroEL paralogs were fixed by adaptive evolution after the groEL gene duplication events.

    On Design Mining: Coevolution and Surrogate Models

    © 2017 Massachusetts Institute of Technology. Published under a Creative Commons Attribution 3.0 Unported (CC BY 3.0) license. Design mining is the use of computational intelligence techniques to iteratively search and model the attribute space of physical objects evaluated directly through rapid prototyping to meet given objectives. It enables the exploitation of novel materials and processes without formal models or complex simulation. In this article, we focus upon the coevolutionary nature of the design process when it is decomposed into concurrent sub-design-threads due to the overall complexity of the task. Using an abstract, tunable model of coevolution, we consider strategies to sample subthread designs for whole-system testing and how best to construct and use surrogate models within the coevolutionary scenario. Drawing on our findings, we then describe the effective design of an array of six heterogeneous vertical-axis wind turbines.

    An exploration of evolutionary computation applied to frequency modulation audio synthesis parameter optimisation

    With the ever-increasing complexity of sound synthesisers, there is a growing demand for automated parameter estimation and sound space navigation techniques. This thesis explores the potential for evolutionary computation to automatically map known sound qualities onto the parameters of frequency modulation synthesis. Within this exploration are original contributions in the domain of synthesis parameter estimation and, within the developed system, evolutionary computation, in the form of the evolutionary algorithms that drive the underlying optimisation process. Based upon the requirement for the parameter estimation system to deliver multiple search space solutions, existing evolutionary algorithmic architectures are augmented to enable niching, while maintaining the strengths of the original algorithms. Two novel evolutionary algorithms are proposed in which cluster analysis is used to identify and maintain species within the evolving populations. A conventional evolution strategy and cooperative coevolution strategy are defined, with cluster-orientated operators that enable the simultaneous optimisation of multiple search space solutions at distinct optima. A test methodology is developed that enables components of the synthesis matching problem to be identified and isolated, enabling the performance of different optimisation techniques to be compared quantitatively. A system is consequently developed that evolves sound matches using conventional frequency modulation synthesis models, and the effectiveness of different evolutionary algorithms is assessed and compared in application to both static and time-varying sound matching problems. Performance of the system is then evaluated by interview with expert listeners. The thesis closes with a reflection on the algorithms and systems that have been developed, discussing possibilities for the future of automated synthesis parameter estimation techniques and how they might be employed.

    High throughput prediction of inter-protein coevolution

    Inter-protein co-evolution analysis can reveal direct or indirect functional or physical protein interactions. It compares the correlation of evolutionary changes between residues on aligned orthologous sequences. Modern methods used in experimental cell biological research to screen for protein-protein interactions, often based on mass spectrometry, frequently identify large numbers of possible interacting proteins. If automated, inter-protein co-evolution analysis can serve as a valuable step in refining such results, which typically contain hundreds of hits, for further experiments. Manual retrieval of tens of orthologous sequences, followed by alignment and phylogenetic tree preparation, is not adequate for such amounts of data. The aim of this thesis is to create a set of scripts that automate high-throughput inter-protein co-evolution analysis. The scripts were written in Python. They use an API client interface to access online databases with the sequences of the input protein identifiers. Through matched identifiers, over 85 representative orthologous sequences from vertebrate species are retrieved from the OrthoDB orthologues database. The scripts align these sequences with the PRANK MSA algorithm and create the corresponding phylogenetic tree. All protein pairs are structured for multicore computation with the CAPS programme on a CSC supercomputer. Multiple CAPS outputs are summarised in a comprehensive form for comparison of relative co-adaptive co-evolution between proposed protein pairs. In this work, I have developed the automation for a protein-interactome screen done by proximity labelling of B cell receptor and plasma membrane associated proteins under activating or non-activating conditions. Applying high-throughput co-evolutionary analysis to these data provides a completely new approach to identifying new players in B cell activation, which is critical for autoimmunity, hypo-immunity or cancer. The results showed unsatisfactory performance of CAPS; an explanation and alternatives are given.

    Turing learning: A metric-free approach to inferring behavior and its application to swarms

    We propose Turing Learning, a novel system identification method for inferring the behavior of natural or artificial systems. Turing Learning simultaneously optimizes two populations of computer programs, one representing models of the behavior of the system under investigation, and the other representing classifiers. By observing the behavior of the system as well as the behaviors produced by the models, two sets of data samples are obtained. The classifiers are rewarded for discriminating between these two sets, that is, for correctly categorizing data samples as either genuine or counterfeit. Conversely, the models are rewarded for 'tricking' the classifiers into categorizing their data samples as genuine. Unlike other methods for system identification, Turing Learning does not require predefined metrics to quantify the difference between the system and its models. We present two case studies with swarms of simulated robots and prove that the underlying behaviors cannot be inferred by a metric-based system identification method. By contrast, Turing Learning infers the behaviors with high accuracy. It also produces a useful by-product - the classifiers - that can be used to detect abnormal behavior in the swarm. Moreover, we show that Turing Learning also successfully infers the behavior of physical robot swarms. The results show that collective behaviors can be directly inferred from motion trajectories of individuals in the swarm, which may have significant implications for the study of animal collectives. Furthermore, Turing Learning could prove useful whenever a behavior is not easily characterizable using metrics, making it suitable for a wide range of applications.
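
The opposing rewards described in this abstract can be sketched as two small fitness functions. This is only an illustration of the reward structure under simplifying assumptions, not the authors' fitness definitions: samples are plain numbers, a classifier is any callable returning `True` for "genuine", and both rewards are simple fractions. All names are hypothetical.

```python
def classifier_fitness(classifier, genuine_samples, counterfeit_samples):
    """Fraction of samples the classifier labels correctly.

    Genuine samples come from the observed system; counterfeit samples
    come from the candidate models.
    """
    correct = sum(1 for s in genuine_samples if classifier(s))
    correct += sum(1 for s in counterfeit_samples if not classifier(s))
    return correct / (len(genuine_samples) + len(counterfeit_samples))

def model_fitness(model_samples, classifiers):
    """Fraction of (classifier, sample) pairs where the model's sample
    is mistaken for genuine, i.e. how often the model 'tricks' a classifier."""
    fooled = sum(1 for c in classifiers for s in model_samples if c(s))
    return fooled / (len(classifiers) * len(model_samples))

# Toy example: a threshold classifier on made-up one-dimensional samples.
genuine = [1.0, 1.1, 0.9]
counterfeit = [0.2, 1.05]

def threshold_classifier(sample):
    return sample > 0.8

print(classifier_fitness(threshold_classifier, genuine, counterfeit))
print(model_fitness(counterfeit, [threshold_classifier]))
```

Neither function compares model output to system output through a predefined distance, which is the sense in which the approach is metric-free: the only signal is whether a classifier can tell the two apart.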

    Evolutionary History of the Photolyase/Cryptochrome Superfamily in Eukaryotes

    Background: Photolyases and cryptochromes are evolutionarily related flavoproteins, which, however, perform distinct physiological functions. Photolyases (PHR) are evolutionarily ancient enzymes. They are activated by light and repair DNA damage caused by UV radiation. Although cryptochromes share structural similarity with DNA photolyases, they lack DNA repair activity. Cryptochrome (CRY) is one of the key elements of the circadian system in animals. In plants, CRY acts as a blue light receptor to entrain circadian rhythms, and mediates a variety of light responses, such as the regulation of flowering and seedling growth. Results: We performed a comprehensive evolutionary analysis of the CRY/PHR superfamily. The superfamily consists of 7 major subfamilies: CPD class I and CPD class II photolyases, (6-4) photolyases, CRY-DASH, plant PHR2, plant CRY and animal CRY. Although the whole superfamily evolved primarily under strong purifying selection (average omega = 0.0168), some subfamilies did experience strong episodic positive selection during their evolution. Photolyases were lost in higher animals, which suggests that natural selection apparently became weaker in the late stage of evolutionary history. The evolutionary time estimates suggested that plant and animal CRYs evolved in the Neoproterozoic Era (~1000-541 Mya), which might be a result of adaptation to the major climate and global light regime changes that occurred in that period of the Earth's geological history.
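
The abstract reports omega without defining it. Assuming it denotes the standard ratio of nonsynonymous to synonymous substitution rates (the abstract itself does not say so), the usual reading is:

```latex
\omega = \frac{d_N}{d_S}, \qquad
\omega < 1 \ \text{(purifying selection)}, \quad
\omega = 1 \ \text{(neutral evolution)}, \quad
\omega > 1 \ \text{(positive selection)}
```

Under that assumption, the reported average of 0.0168 is far below 1, which is consistent with the strong purifying selection the abstract describes.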