
    A Survey on Soft Subspace Clustering

    Subspace clustering (SC) is a promising clustering technique that identifies clusters based on their associations with subspaces of a high-dimensional space. SC can be classified into hard subspace clustering (HSC) and soft subspace clustering (SSC). While HSC algorithms have been extensively studied and are well accepted by the scientific community, SSC algorithms are relatively new but have gained increasing attention in recent years owing to their better adaptability. This paper presents a comprehensive survey of existing SSC algorithms and recent developments. The SSC algorithms are classified systematically into three main categories: conventional SSC (CSSC), independent SSC (ISSC) and extended SSC (XSSC). The characteristics of these algorithms are highlighted, and potential future developments of SSC are also discussed.
    Comment: This paper has been published in Information Sciences Journal in 201
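
    The idea that distinguishes SSC from HSC is that each cluster carries its own soft feature weights rather than a hard feature subset. The sketch below is a minimal, illustrative entropy-weighted k-means in the spirit of SSC, not any specific algorithm from the survey; the function name, the `gamma` regularisation parameter, and the toy data are all assumptions for demonstration.

```python
import numpy as np

def soft_subspace_kmeans(X, k, gamma=1.0, n_iter=20):
    """Entropy-weighted k-means sketch: each cluster j keeps soft feature
    weights W[j] that concentrate on the features along which the cluster
    is compact (illustrative of the SSC idea, not a published algorithm)."""
    n, d = X.shape
    centers = X[np.linspace(0, n - 1, k, dtype=int)].astype(float)
    W = np.full((k, d), 1.0 / d)              # soft per-cluster feature weights
    labels = np.zeros(n, dtype=int)
    for _ in range(n_iter):
        # assignment step: weighted squared distance to each center
        dists = np.stack([((X - centers[j]) ** 2 * W[j]).sum(axis=1)
                          for j in range(k)], axis=1)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members) == 0:
                continue
            centers[j] = members.mean(axis=0)
            disp = ((members - centers[j]) ** 2).sum(axis=0)  # per-feature dispersion
            w = np.exp(-disp / gamma)         # entropy-regularised weight update
            W[j] = w / w.sum()
    return labels, W

# Two clusters that differ only along feature 0; feature 1 is pure noise,
# so the learned weights should concentrate on feature 0.
rng = np.random.default_rng(1)
A = np.c_[rng.normal(0, 0.1, 50), rng.normal(0, 1, 50)]
B = np.c_[rng.normal(5, 0.1, 50), rng.normal(0, 1, 50)]
labels, W = soft_subspace_kmeans(np.vstack([A, B]), k=2)
```

    On this toy data both clusters learn a weight near 1 on the informative feature, which is exactly the "soft subspace" each cluster occupies.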

    A cellular coevolutionary algorithm for image segmentation


    Attribute Equilibrium Dominance Reduction Accelerator (DCCAEDR) Based on Distributed Coevolutionary Cloud and Its Application in Medical Records

    © 2013 IEEE. To address the tremendous challenge of attribute reduction for big-data mining and knowledge discovery, we propose a new attribute equilibrium dominance reduction accelerator (DCCAEDR) based on a distributed coevolutionary cloud model. First, an N-population distributed coevolutionary MapReduce framework is designed to divide the entire population into N subpopulations that share the rewards of one another's solutions under a MapReduce cloud mechanism. Because this achieves a better adaptive balance between exploration and exploitation, the reduction performance is guaranteed to match that obtained on the whole data set. Second, a novel Nash equilibrium dominance strategy for elitists under N bounded-rationality regions is adopted to help the subpopulations attain a stable state of Nash equilibrium dominance; this further enhances the accelerator's robustness against complex noise in big data. Third, an approximation parallelism mechanism based on MapReduce is constructed to implement rule reduction by accelerating the computation of attribute equivalence classes. Consequently, the entire attribute reduction set with the equilibrium dominance solution can be obtained. Extensive simulation results illustrate the effectiveness and robustness of the proposed DCCAEDR accelerator for attribute reduction on big data. Furthermore, DCCAEDR is applied to attribute reduction for traditional Chinese medical records and to segmenting cortical surfaces in neonatal brain 3-D MRI records, where it shows superior results compared with representative algorithms.
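
    The MapReduce step the abstract describes, accelerating the computation of attribute equivalence classes across data partitions, can be illustrated with a small rough-set sketch. This is not the DCCAEDR algorithm itself (the coevolutionary populations and the Nash equilibrium dominance strategy are omitted); the record fields and function names are hypothetical.

```python
from collections import defaultdict
from functools import reduce

def map_partition(rows, attrs):
    """Mapper: within one data partition, count decision labels per
    attribute-value signature (i.e. per equivalence class)."""
    counts = defaultdict(lambda: defaultdict(int))
    for row in rows:
        signature = tuple(row[a] for a in attrs)
        counts[signature][row["decision"]] += 1
    return counts

def reduce_counts(c1, c2):
    """Reducer: merge the per-partition counts."""
    for signature, dec_counts in c2.items():
        for dec, n in dec_counts.items():
            c1[signature][dec] += n
    return c1

def dependency(rows, attrs, n_parts=2):
    """Rough-set dependency of the decision on `attrs`: the fraction of
    records lying in equivalence classes with a single decision value."""
    parts = [rows[i::n_parts] for i in range(n_parts)]
    merged = reduce(reduce_counts, (map_partition(p, attrs) for p in parts))
    consistent = sum(sum(d.values()) for d in merged.values() if len(d) == 1)
    return consistent / len(rows)

# Hypothetical medical records: "fever" alone determines the decision,
# so {fever} is a reduct candidate while {cough} carries no dependency.
records = [
    {"fever": 1, "cough": 0, "decision": "flu"},
    {"fever": 1, "cough": 1, "decision": "flu"},
    {"fever": 0, "cough": 0, "decision": "cold"},
    {"fever": 0, "cough": 1, "decision": "cold"},
]
```

    Because the per-class counts merge associatively, the same computation distributes over any number of partitions, which is the property a MapReduce accelerator exploits.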

    Evolutionary Algorithms

    Evolutionary algorithms (EAs) are population-based metaheuristics, originally inspired by aspects of natural evolution. Modern varieties incorporate a broad mixture of search mechanisms and tend to blend inspiration from nature with pragmatic engineering concerns; however, all EAs essentially operate by maintaining a population of potential solutions and in some way artificially 'evolving' that population over time. Particularly well-known categories of EAs include genetic algorithms (GAs), genetic programming (GP), and evolution strategies (ES). EAs have proven very successful in practical applications, particularly those requiring solutions to combinatorial problems. EAs are highly flexible and can be configured to address any optimization task, without the requirements for reformulation and/or simplification that would be needed for other techniques. However, this flexibility comes at a cost: tailoring an EA's configuration and parameters so as to provide robust performance for a given class of tasks is often a complex and time-consuming process, and this tailoring is one of the many ongoing research areas associated with EAs.
    Comment: To appear in R. Marti, P. Pardalos, and M. Resende, eds., Handbook of Heuristics, Springer
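
    The common skeleton described above, maintain a population and artificially evolve it, can be made concrete with a minimal genetic algorithm on the classic OneMax toy problem (maximise the number of 1-bits). The operators chosen here, tournament selection, one-point crossover and bit-flip mutation, are one standard configuration among many; all parameter values are illustrative, not tuned.

```python
import random

def one_max(bits):
    """Toy fitness: the number of 1-bits in the chromosome."""
    return sum(bits)

def evolve(pop_size=30, n_bits=20, generations=60, p_mut=0.05, seed=0):
    """Minimal generational GA: tournament selection, one-point crossover,
    bit-flip mutation."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(generations):
        def tournament():
            a, b = rng.sample(pop, 2)
            return a if one_max(a) >= one_max(b) else b
        offspring = []
        while len(offspring) < pop_size:
            p1, p2 = tournament(), tournament()
            cut = rng.randrange(1, n_bits)                       # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [b ^ (rng.random() < p_mut) for b in child]  # bit-flip mutation
            offspring.append(child)
        pop = offspring
    return max(pop, key=one_max)

best = evolve()
```

    Swapping in a different fitness function is all it takes to aim the same loop at another problem, which is the flexibility, and the parameter-tailoring burden, the abstract refers to.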

    Information Theory in Molecular Evolution: From Models to Structures and Dynamics

    This Special Issue collects novel contributions from scientists in the interdisciplinary field of biomolecular evolution. The works listed here use information-theoretic concepts at their core but are tightly integrated with the study of molecular processes. Applications include the analysis of phylogenetic signals to elucidate biomolecular structure and function, the study and quantification of structural dynamics and allostery, and models of molecular interaction specificity inspired by evolutionary cues.

    Cloud computing resource scheduling and a survey of its evolutionary approaches

    Cloud computing is a disruptive technology that is fundamentally transforming how computing services are delivered, offering information and communication technology users convenient access to resources as services via the Internet. Because the cloud provides a finite pool of virtualized, on-demand resources, scheduling them optimally has become an essential and rewarding topic, and a trend of applying Evolutionary Computation (EC) algorithms is emerging rapidly. By analyzing the cloud computing architecture, this survey first presents a two-level taxonomy of cloud resource scheduling. It then paints a landscape of the scheduling problem and its solutions. Following the taxonomy, a comprehensive survey of state-of-the-art approaches is presented systematically. Looking forward, challenges and potential future research directions are identified, including real-time scheduling, adaptive dynamic scheduling, large-scale scheduling, multiobjective scheduling, and distributed and parallel scheduling. At the dawn of Industry 4.0, cloud computing scheduling for cyber-physical integration in the presence of big data is also discussed. Research in this area is only in its infancy, but with the rapid fusion of information and data technology, more exciting and agenda-setting topics are likely to emerge on the horizon.
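
    As a concrete illustration of why EC algorithms suit this problem, the sketch below evolves task-to-VM assignments to minimise makespan, the finish time of the busiest virtual machine. It is a toy (mu+lambda)-style loop, not any algorithm from the survey; the task lengths and VM speeds are hypothetical.

```python
import random

def makespan(assign, task_len, vm_speed):
    """Finish time of the busiest VM under a task-to-VM assignment."""
    load = [0.0] * len(vm_speed)
    for task, vm in enumerate(assign):
        load[vm] += task_len[task] / vm_speed[vm]
    return max(load)

def evolve_schedule(task_len, vm_speed, pop_size=40, gens=200, seed=0):
    """Toy evolutionary scheduler: keep the better half of the population
    (truncation selection with implicit elitism) and mutate each survivor
    by moving one task to a random VM."""
    rng = random.Random(seed)
    n, m = len(task_len), len(vm_speed)
    pop = [[rng.randrange(m) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(gens):
        pop.sort(key=lambda a: makespan(a, task_len, vm_speed))
        parents = pop[: pop_size // 2]
        children = []
        for p in parents:
            child = p[:]
            child[rng.randrange(n)] = rng.randrange(m)   # move one task
            children.append(child)
        pop = parents + children
    return min(pop, key=lambda a: makespan(a, task_len, vm_speed))

tasks = [4, 8, 3, 7, 5, 2, 6, 1]    # hypothetical task lengths
vms = [1.0, 2.0]                    # hypothetical VM speeds (2nd is twice as fast)
best = evolve_schedule(tasks, vms)
# Lower bound: total work 36 on combined speed 3 gives makespan >= 12.
```

    The same chromosome encoding extends naturally to multiobjective variants (e.g. makespan plus cost), which is where much of the surveyed EC work concentrates.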

    Knowledge-based identification of functional domains in proteins

    The characterization of proteins and enzymes is traditionally organised according to the sequence-structure-function paradigm. The investigation of the inter-relationships between these three properties has motivated the development of several experimental and computational techniques that have made available an unprecedented amount of sequence and structural data. The interest in developing comparative methods for rationalizing such copious information has, of course, grown in parallel. Regarding the structure-function relationship, for instance, the availability of experimentally resolved protein structures and of computer simulations has improved our understanding of the role of proteins' internal dynamics in assisting their functional rearrangements and activity. Several approaches are currently available for elucidating and comparing proteins' internal dynamics. These can capture the relevant collective degrees of freedom that recapitulate the main conformational changes, and such collective coordinates have the potential to unveil remote evolutionary relationships between proteins that are otherwise not easily accessible from purely sequence- or structure-based investigations. Starting from this premise, in the first chapter of this thesis I will present a novel and general computational method that can detect large-scale dynamical correlations in proteins by comparing different representative conformers. This is accomplished by applying dimensionality-reduction techniques to inter-amino-acid distance fluctuation matrices. As a result, an optimal quasi-rigid domain decomposition of the protein or macromolecular assembly of interest is identified, which facilitates the functionally oriented interpretation of its internal dynamics. Building on this approach, in the second chapter I will discuss its systematic application to a class of membrane proteins of paramount biochemical interest, namely the class A G protein-coupled receptors.
The comparative analysis of their internal dynamics, as encoded by the quasi-rigid domains, allowed us to identify recurrent patterns in the large-scale dynamics of these receptors. This, in turn, allowed us to single out a number of key functional sites. These were, for the most part, previously known -- a fact that at once validates the method and gives confidence in the viability of the other, novel sites. Finally, in the last part of the thesis, I focussed on the sequence-structure relationship. In particular, I considered the problem of inferring structural properties of proteins from the analysis of large multiple sequence alignments of homologous sequences. For this purpose, I recast the strategies developed for extracting dynamical features so as to identify compact groups of coevolving residues, based only on the knowledge of amino acid variability in aligned primary sequences. Throughout the thesis, many methodological techniques have been taken into consideration, mainly based on concepts from graph theory and statistical data analysis (clustering). All these topics are explained in the methodological sections of each chapter.
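
    The quasi-rigid decomposition step described in the first chapter can be illustrated on synthetic data: build conformers of a toy two-block "protein" with a flexible hinge, form the inter-site distance fluctuation matrix, and apply a dimensionality-reduction split (here, the sign of the Fiedler vector of a similarity graph). This is a schematic reconstruction under stated assumptions, not the thesis's actual method or data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "protein": 20 sites on a line, two rigid blocks joined at site 10.
# Each conformer rotates the second block about the hinge, so intra-block
# distances are constant while inter-block distances fluctuate.
n_sites, n_conf = 20, 50
base = np.c_[np.arange(n_sites, dtype=float), np.zeros(n_sites)]
conformers = []
for _ in range(n_conf):
    ang = rng.uniform(-1.5, 1.5)
    R = np.array([[np.cos(ang), -np.sin(ang)],
                  [np.sin(ang),  np.cos(ang)]])
    c = base.copy()
    c[10:] = (c[10:] - c[10]) @ R.T + c[10]   # rotate block 2 about the hinge
    conformers.append(c)

# Distance fluctuation matrix: std-dev of every pairwise distance.
dists = np.array([np.linalg.norm(c[:, None] - c[None, :], axis=-1)
                  for c in conformers])
F = dists.std(axis=0)

# Quasi-rigid decomposition: two-way spectral split of the similarity
# graph exp(-F) using the sign of the Fiedler vector of its Laplacian.
S = np.exp(-F)
L = np.diag(S.sum(axis=1)) - S
_, vecs = np.linalg.eigh(L)
domains = (vecs[:, 1] > 0).astype(int)
```

    Sites within each rigid block show near-zero mutual distance fluctuation, so the spectral split recovers the two blocks, with the ambiguous hinge region sitting near the sign change.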

    Personalised information modelling technologies for personalised medicine

    Personalised modelling offers a new and effective approach to pattern recognition and knowledge discovery, especially for biomedical applications. The created models are more useful and informative for analysing and evaluating an individual data object for a given problem. Such models are also expected to achieve a higher accuracy of outcome prediction or classification than conventional systems and methodologies. Motivated by the concept of personalised medicine and utilising transductive reasoning, personalised modelling was recently proposed as a new method for knowledge discovery in biomedical applications. Personalised modelling aims to create a unique computational diagnostic or prognostic model for an individual. Here we introduce an integrated method for personalised modelling that applies global optimisation of variables (features) and an appropriately sized neighbourhood to create an accurate personalised model for an individual. This method creates an integrated computational system that combines different information processing techniques applied at different stages of data analysis, e.g. feature selection, classification, discovery of gene interactions, outcome prediction, personalised profiling and visualisation. It allows for adaptation, monitoring and improvement of an individual's model and leads to improved accuracy and unique personalised profiling that could be used for personalised treatment and personalised drug design.
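
    A minimal sketch of the transductive idea, building a model only from the training samples nearest to the individual being assessed, is given below. It uses a distance-weighted vote in the personalised neighbourhood; the integrated feature-selection and global-optimisation stages described above are omitted, and all data and names are hypothetical.

```python
import numpy as np

def personalised_predict(x_new, X, y, k=5):
    """Transductive sketch: form the model only from the k training
    samples nearest to the individual, then take a distance-weighted
    vote over their class labels."""
    d = np.linalg.norm(X - x_new, axis=1)
    idx = np.argsort(d)[:k]                 # the personalised neighbourhood
    w = 1.0 / (d[idx] + 1e-9)               # closer samples count more
    scores = {c: w[y[idx] == c].sum() for c in np.unique(y)}
    return max(scores, key=scores.get)

# Hypothetical two-class data set: classes differ only along feature 0.
rng = np.random.default_rng(0)
healthy = rng.normal(0.0, 1.0, (30, 3))
disease = rng.normal(0.0, 1.0, (30, 3))
disease[:, 0] += 6.0
X = np.vstack([healthy, disease])
y = np.array([0] * 30 + [1] * 30)
pred = personalised_predict(np.array([6.0, 0.0, 0.0]), X, y)
```

    Because a fresh neighbourhood is selected per query, each individual effectively gets their own model, which is the defining trait of the transductive, personalised approach.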