2,561 research outputs found

    Multimodal estimation of distribution algorithms

    Get PDF
    Taking the advantage of estimation of distribution algorithms (EDAs) in preserving high diversity, this paper proposes a multimodal EDA. Integrated with clustering strategies for crowding and speciation, two versions of this algorithm are developed, which operate at the niche level. Then these two algorithms are equipped with three distinctive techniques: 1) a dynamic cluster sizing strategy; 2) an alternative utilization of Gaussian and Cauchy distributions to generate offspring; and 3) an adaptive local search. The dynamic cluster sizing affords a potential balance between exploration and exploitation and reduces the sensitivity to the cluster size in the niching methods. Taking advantages of Gaussian and Cauchy distributions, we generate the offspring at the niche level through alternatively using these two distributions. Such utilization can also potentially offer a balance between exploration and exploitation. Further, solution accuracy is enhanced through a new local search scheme probabilistically conducted around seeds of niches with probabilities determined self-adaptively according to fitness values of these seeds. Extensive experiments conducted on 20 benchmark multimodal problems confirm that both algorithms can achieve competitive performance compared with several state-of-the-art multimodal algorithms, which is supported by nonparametric tests. Especially, the proposed algorithms are very promising for complex problems with many local optima

    Soft Computing Techiniques for the Protein Folding Problem on High Performance Computing Architectures

    Get PDF
    The protein-folding problem has been extensively studied during the last fifty years. The understanding of the dynamics of global shape of a protein and the influence on its biological function can help us to discover new and more effective drugs to deal with diseases of pharmacological relevance. Different computational approaches have been developed by different researchers in order to foresee the threedimensional arrangement of atoms of proteins from their sequences. However, the computational complexity of this problem makes mandatory the search for new models, novel algorithmic strategies and hardware platforms that provide solutions in a reasonable time frame. We present in this revision work the past and last tendencies regarding protein folding simulations from both perspectives; hardware and software. Of particular interest to us are both the use of inexact solutions to this computationally hard problem as well as which hardware platforms have been used for running this kind of Soft Computing techniques.This work is jointly supported by the FundaciónSéneca (Agencia Regional de Ciencia y Tecnología, Región de Murcia) under grants 15290/PI/2010 and 18946/JLI/13, by the Spanish MEC and European Commission FEDER under grant with reference TEC2012-37945-C02-02 and TIN2012-31345, by the Nils Coordinated Mobility under grant 012-ABEL-CM-2014A, in part financed by the European Regional Development Fund (ERDF). We also thank NVIDIA for hardware donation within UCAM GPU educational and research centers.Ingeniería, Industria y Construcció

    Fast learning optimized prediction methodology for protein secondary structure prediction, relative solvent accessibility prediction and phosphorylation prediction

    Get PDF
    Computational methods are rapidly gaining importance in the field of structural biology, mostly due to the explosive progress in genome sequencing projects and the large disparity between the number of sequences and the number of structures. There has been an exponential growth in the number of available protein sequences and a slower growth in the number of structures. There is therefore an urgent need to develop computed structures and identify the functions of these sequences. Developing methods that will satisfy these needs both efficiently and accurately is of paramount importance for advances in many biomedical fields, for a better basic understanding of aberrant states of stress and disease, including drug discovery and discovery of biomarkers. Several aspects of secondary structure predictions and other protein structure-related predictions are investigated using different types of information such as data obtained from knowledge-based potentials derived from amino acids in protein sequences, physicochemical properties of amino acids and propensities of amino acids to appear at the ends of secondary structures. Investigating the performance of these secondary structure predictions by type of amino acid highlights some interesting aspects relating to the influences of the individual amino acid types on formation of secondary structures and points toward ways to make further gains. Other research areas include Relative Solvent Accessibility (RSA) predictions and predictions of phosphorylation sites, which is one of the Post-Translational Modification (PTM) sites in proteins. Protein secondary structures and other features of proteins are predicted efficiently, reliably, less expensively and more accurately. A novel method called Fast Learning Optimized PREDiction (FLOPRED) Methodology is proposed for predicting protein secondary structures and other features, using knowledge-based potentials, a Neural Network based Extreme Learning Machine (ELM) and advanced Particle Swarm Optimization (PSO) techniques that yield better and faster convergence to produce more accurate results. These techniques yield superior classification of secondary structures, with a training accuracy of 93.33% and a testing accuracy of 92.24% with a standard deviation of 0.48% obtained for a small group of 84 proteins. We have a Matthew\u27s correlation-coefficient ranging between 80.58% and 84.30% for these secondary structures. Accuracies for individual amino acids range between 83% and 92% with an average standard deviation between 0.3% and 2.9% for the 20 amino acids. On a larger set of 415 proteins, we obtain a testing accuracy of 86.5% with a standard deviation of 1.38%. These results are significantly higher than those found in the literature. Prediction of protein secondary structure based on amino acid sequence is a common technique used to predict its 3-D structure. Additional information such as the biophysical properties of the amino acids can help improve the results of secondary structure prediction. A database of protein physicochemical properties is used as features to encode protein sequences and this data is used for secondary structure prediction using FLOPRED. Preliminary studies using a Genetic Algorithm (GA) for feature selection, Principal Component Analysis (PCA) for feature reduction and FLOPRED for classification give promising results. Some amino acids appear more often at the ends of secondary structures than others. A preliminary study has indicated that secondary structure accuracy can be improved as much as 6% by including these effects for those residues present at the ends of alpha-helix, beta-strand and coil. A study on RSA prediction using ELM shows large gains in processing speed compared to using support vector machines for classification. This indicates that ELM yields a distinct advantage in terms of processing speed and performance for RSA. Additional gains in accuracies are possible when the more advanced FLOPRED algorithm and PSO optimization are implemented. Phosphorylation is a post-translational modification on proteins often controls and regulates their activities. It is an important mechanism for regulation. Phosphorylated sites are known to be present often in intrinsically disordered regions of proteins lacking unique tertiary structures, and thus less information is available about the structures of phosphorylated sites. It is important to be able to computationally predict phosphorylation sites in protein sequences obtained from mass-scale sequencing of genomes. Phosphorylation sites may aid in the determination of the functions of a protein and to better understanding the mechanisms of protein functions in healthy and diseased states. FLOPRED is used to model and predict experimentally determined phosphorylation sites in protein sequences. Our new PSO optimization included in FLOPRED enable the prediction of phosphorylation sites with higher accuracy and with better generalization. Our preliminary studies on 984 sequences demonstrate that this model can predict phosphorylation sites with a training accuracy of 92.53% , a testing accuracy 91.42% and Matthew\u27s correlation coefficient of 83.9%. In summary, secondary structure prediction, Relative Solvent Accessibility and phosphorylation site prediction have been carried out on multiple sets of data, encoded with a variety of information drawn from proteins and the physicochemical properties of their constituent amino acids. Improved and efficient algorithms called S-ELM and FLOPRED, which are based on Neural Networks and Particle Swarm Optimization are used for classifying and predicting protein sequences. Analysis of the results of these studies provide new and interesting insights into the influence of amino acids on secondary structure prediction. S-ELM and FLOPRED have also proven to be robust and efficient for predicting relative solvent accessibility of proteins and phosphorylation sites. These studies show that our method is robust and resilient and can be applied for a variety of purposes. It can be expected to yield higher classification accuracy and better generalization performance compared to previous methods

    How Can Bee Colony Algorithm Serve Medicine?

    Get PDF
    Healthcare professionals usually should make complex decisions with far reaching consequences and associated risks in health care fields. As it was demonstrated in other industries, the ability to drill down into pertinent data to explore knowledge behind the data can greatly facilitate superior, informed decisions to ensue the facts. Nature has always inspired researchers to develop models of solving the problems. Bee colony algorithm (BCA), based on the self-organized behavior of social insects is one of the most popular member of the family of population oriented, nature inspired meta-heuristic swarm intelligence method which has been proved its superiority over some other nature inspired algorithms. The objective of this model was to identify valid novel, potentially useful, and understandable correlations and patterns in existing data. This review employs a thematic analysis of online series of academic papers to outline BCA in medical hive, reducing the response and computational time and optimizing the problems. To illustrate the benefits of this model, the cases of disease diagnose system are presented
    • …
    corecore