Sibe: a computation tool to apply protein sequence statistics to predict folding and design in silico.
BACKGROUND: Evolutionary information contained in the amino acid sequences of proteins specifies the biological function and fold, but exactly what information contained in the protein sequence drives both of these processes? Considerable progress has been made to answer this fundamental question, but it remains challenging to explore the potential space of cooperative interactions between amino acids. Statistical analysis plays a significant role in studying such interactions and its use has expanded in recent years to studies ranging from coevolution-guided rational protein design to protein folding in silico. RESULTS: Here we describe a computational tool named Sibe for use in studies of protein sequence, folding, and design using evolutionary coupling between amino acids as a driving factor. In this study, Sibe is used to identify positionally conserved couplings between pairs of amino acids and to aid rational protein design. In this process, pairwise couplings are filtered according to the relative entropy computed from the positional conservation and grouped into several 'blocks', which could contribute to driving protein folding and design. A human β2-adrenergic receptor (β2AR) was used to demonstrate that those 'blocks' contribute to the rational design for specifying functional residues. Sibe also provides folding modules based on both the positionally conserved couplings and well-established statistical potentials for simulating protein folding in silico and predicting tertiary structure. Our results show that statistical inferences of basic evolutionary principles, such as conservation and coupled mutations, can be used to rapidly design a diverse set of proteins and study protein folding. CONCLUSIONS: The developed software Sibe provides a computational tool for systematic analysis from a protein's primary structure to its tertiary structure using the evolutionary couplings as a driving factor.
Sibe, written in C++, is designed for compatibility with the 'big data' era in biological science. It focuses primarily on protein sequence analysis, but it can also be extended to other modeling and prediction of experimental measurements.
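The relative entropy used for filtering in the Sibe abstract above can be illustrated with a small sketch. This is not Sibe's implementation (Sibe is written in C++ and its exact formula is not given here). It assumes a uniform background distribution over the 20 standard amino acids, ignores gap characters, and uses hypothetical function names (`column_relative_entropy`, `conservation_profile`) and a toy alignment.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def column_relative_entropy(column, background=None, pseudocount=0.0):
    """Relative entropy (KL divergence, in nats) of one alignment column
    against a background distribution.

    Assumption: the background is uniform over the 20 standard amino acids
    unless one is supplied. Characters outside the background (e.g. gaps)
    are ignored.
    """
    if background is None:
        background = {a: 1.0 / len(AMINO_ACIDS) for a in AMINO_ACIDS}
    counts = Counter(c for c in column if c in background)
    total = sum(counts.values()) + pseudocount * len(background)
    if total == 0:
        return 0.0  # column holds only gaps or unknown symbols
    divergence = 0.0
    for residue, q in background.items():
        p = (counts.get(residue, 0) + pseudocount) / total
        if p > 0:
            divergence += p * math.log(p / q)
    return divergence


def conservation_profile(alignment):
    """Per-column relative entropy for equal-length aligned sequences."""
    return [column_relative_entropy(col) for col in zip(*alignment)]


# Toy alignment (hypothetical data): column 0 is fully conserved.
toy_alignment = ["ACD", "ACE", "AGF", "AGH"]
profile = conservation_profile(toy_alignment)
```

Under the uniform background assumed here, a fully conserved column scores ln 20 (about 3.00 nats) and a column with all 20 residues equally represented scores 0, so higher values mark more conserved positions.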
Comprehensibility, Overfitting and Co-Evolution in Genetic Programming for Technical Trading Rules
This thesis presents Genetic Programming (GP) methodologies to find successful and understandable technical trading rules for financial markets. When applied to the S&P500, the methods consistently beat the buy-and-hold strategy over a 12-year period, even when considering transaction costs. Some of the methods described discover rules that beat the S&P500 with 99% significance. The work describes the use of a complexity-penalizing factor to avoid overfitting and improve comprehensibility of the rules produced by GP. The effect of this factor on the returns for this domain area is studied, and the results indicate that it increased the predictive ability of the rules. A restricted set of operators and domain knowledge were used to improve comprehensibility. In particular, arithmetic operators were eliminated, and a number of technical indicators, such as trend lines and local maxima and minima, were added to the widely used moving averages. A new evaluation function that tests for consistency of returns in addition to total returns is introduced. Different cooperative coevolutionary genetic programming strategies for improving returns are studied and the results analyzed. We find that paired collaborator coevolution has the best results.
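To make the idea of a complexity-penalizing factor in the abstract above concrete, here is a minimal sketch. The abstract does not state the form of the factor, so the linear penalty on rule size, the tuple encoding of rules, and the names `count_nodes` and `penalized_fitness` are all assumptions made for illustration only.

```python
def count_nodes(rule):
    """Count nodes in a rule tree.

    Assumed encoding: an internal node is a tuple (operator, child, ...),
    and a leaf is any non-tuple value such as an indicator name.
    """
    if isinstance(rule, tuple):
        return 1 + sum(count_nodes(child) for child in rule[1:])
    return 1


def penalized_fitness(excess_return, rule, penalty=0.01):
    """Fitness after subtracting a size penalty.

    Assumption: a linear penalty per node. The thesis may use a
    different functional form.
    """
    return excess_return - penalty * count_nodes(rule)


# Hypothetical rules built from comparison and boolean operators only,
# in line with the abstract's removal of arithmetic operators.
simple_rule = ("gt", "price", "ma_50")
complex_rule = ("and", ("gt", "price", "ma_50"), ("gt", "ma_50", "ma_200"))

# With equal raw returns, the smaller rule gets the higher fitness.
simple_score = penalized_fitness(0.10, simple_rule)
complex_score = penalized_fitness(0.10, complex_rule)
```

With such a penalty, a larger rule must earn enough additional return to justify its extra nodes, which pushes the search toward shorter and more readable rules.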
Cooperative coevolution of artificial neural network ensembles for pattern classification
This paper presents a cooperative coevolutionary approach for designing neural network ensembles. Cooperative coevolution is a recent paradigm in evolutionary computation that allows the effective modeling of cooperative environments. Although, theoretically, a single neural network with a sufficient number of neurons in the hidden layer would suffice to solve any problem, in practice many real-world problems are too hard for an appropriate network to be constructed that solves them. In such problems, neural network ensembles are a successful alternative. Nevertheless, the design of neural network ensembles is a complex task. In this paper, we propose a general framework for designing neural network ensembles by means of cooperative coevolution. The proposed model has two main objectives: first, the improvement of the combination of the trained individual networks; second, the cooperative evolution of such networks, encouraging collaboration among them, instead of a separate training of each network. In order to favor the cooperation of the networks, each network is evaluated throughout the evolutionary process using a multiobjective method. For each network, different objectives are defined, considering not only its performance in the given problem, but also its cooperation with the rest of the networks. In addition, a population of ensembles is evolved, improving the combination of networks and obtaining subsets of networks to form ensembles that perform better than the combination of all the evolved networks. The proposed model is applied to ten real-world classification problems of very different natures from the UCI machine learning repository and the Proben1 benchmark set. In all of them, the model performs better than standard ensembles in terms of generalization error. Moreover, the obtained ensembles are also smaller.
The size of the immune repertoire of bacteria
Some bacteria and archaea possess an immune system, based on the CRISPR-Cas
mechanism, that confers adaptive immunity against phage. In such species,
individual bacteria maintain a "cassette" of viral DNA elements called spacers
as a memory of past infections. The typical cassette contains a few dozen
spacers. Given that bacteria can have very large genomes, and since having more
spacers should confer a better memory, it is puzzling that so little genetic
space would be devoted by bacteria to their adaptive immune system. Here, we
identify a fundamental trade-off between the size of the bacterial immune
repertoire and effectiveness of response to a given threat, and show how this
trade-off imposes a limit on the optimal size of the CRISPR cassette.
Comment: 9 pages, 5 figures
An exploration of evolutionary computation applied to frequency modulation audio synthesis parameter optimisation
With the ever-increasing complexity of sound synthesisers, there is a growing demand for automated parameter estimation and sound space navigation techniques. This thesis explores the potential for evolutionary computation to automatically map known sound qualities onto the parameters of frequency modulation synthesis. Within this exploration are original contributions in the domain of synthesis parameter estimation and, within the developed system, evolutionary computation, in the form of the evolutionary algorithms that drive the underlying optimisation process. Based upon the requirement for the parameter estimation system to deliver multiple search space solutions, existing evolutionary algorithmic architectures are augmented to enable niching, while maintaining the strengths of the original algorithms. Two novel evolutionary algorithms are proposed in which cluster analysis is used to identify and maintain species within the evolving populations. A conventional evolution strategy and cooperative coevolution strategy are defined, with cluster-orientated operators that enable the simultaneous optimisation of multiple search space solutions at distinct optima. A test methodology is developed that enables components of the synthesis matching problem to be identified and isolated, enabling the performance of different optimisation techniques to be compared quantitatively. A system is consequently developed that evolves sound matches using conventional frequency modulation synthesis models, and the effectiveness of different evolutionary algorithms is assessed and compared in application to both static and time-varying sound matching problems. Performance of the system is then evaluated by interview with expert listeners. The thesis closes with a reflection on the algorithms and systems which have been developed, discussing possibilities for the future of automated synthesis parameter estimation techniques, and how they might be employed.
Insights from Coarse-Grained Gō Models for Protein Folding and Dynamics
Exploring the landscape of large-scale conformational changes such as protein folding at atomistic detail poses a considerable computational challenge. Coarse-grained representations of the peptide chain have therefore been developed and over the last decade have proved extremely valuable. These include topology-based Gō models, which constitute a smooth and funnel-like approximation to the folding landscape. We review the many variations of the Gō model that have been employed to yield insight into folding mechanisms. Their success has been interpreted as a consequence of the dominant role of the native topology in folding. The role of local contact density in determining protein dynamics is also discussed and is used to explain the ability of Gō-like models to capture sequence effects in folding and elucidate conformational transitions.
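As a rough illustration of why a topology-based Gō model gives a funnel-like landscape, the sketch below scores a conformation only by how many native contacts it has formed. It is a deliberately simplified square-well variant, not any specific model from the review. The 8.0 Å cutoff, the minimum sequence separation of 3, the unit contact energy, and the function names are all assumptions.

```python
import math


def contact_pairs(coords, cutoff=8.0, min_separation=3):
    """Residue pairs (i, j) with j - i >= min_separation within the cutoff.

    coords: one (x, y, z) position per residue, e.g. C-alpha atoms.
    """
    pairs = set()
    n = len(coords)
    for i in range(n):
        for j in range(i + min_separation, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                pairs.add((i, j))
    return pairs


def go_energy(coords, native_contacts, epsilon=1.0, cutoff=8.0):
    """Simplified Go energy: -epsilon per native contact currently formed.

    Non-native contacts contribute nothing, so the energy falls
    monotonically as native contacts form.
    """
    formed = contact_pairs(coords, cutoff) & native_contacts
    return -epsilon * len(formed)


def fraction_native(coords, native_contacts, cutoff=8.0):
    """Fraction Q of native contacts present in a conformation."""
    if not native_contacts:
        return 0.0
    formed = contact_pairs(coords, cutoff) & native_contacts
    return len(formed) / len(native_contacts)


# Hypothetical 6-residue example: a U-shaped "native" fold and a fully
# extended chain, both with 3.8 Angstrom spacing between neighbours.
native = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0),
          (7.6, 3.8, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 0.0)]
extended = [(3.8 * i, 0.0, 0.0) for i in range(6)]

native_contacts = contact_pairs(native)
```

Because only contacts present in the native structure are rewarded, the native conformation is the global energy minimum by construction, and the fraction of native contacts serves as a natural progress coordinate for folding.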