854 research outputs found
A Survey on Soft Subspace Clustering
Subspace clustering (SC) is a promising clustering technology to identify
clusters based on their associations with subspaces in high dimensional spaces.
SC can be classified into hard subspace clustering (HSC) and soft subspace
clustering (SSC). While HSC algorithms have been extensively studied and well
accepted by the scientific community, SSC algorithms are relatively new but
gaining more attention in recent years due to better adaptability. In the
paper, a comprehensive survey on existing SSC algorithms and the recent
development are presented. The SSC algorithms are classified systematically
into three main categories, namely, conventional SSC (CSSC), independent SSC
(ISSC) and extended SSC (XSSC). The characteristics of these algorithms are
highlighted and the potential future development of SSC is also discussed.Comment: This paper has been published in Information Sciences Journal in 201
Simultaneous identification of specifically interacting paralogs and inter-protein contacts by Direct-Coupling Analysis
Understanding protein-protein interactions is central to our understanding of
almost all complex biological processes. Computational tools exploiting rapidly
growing genomic databases to characterize protein-protein interactions are
urgently needed. Such methods should connect multiple scales from evolutionary
conserved interactions between families of homologous proteins, over the
identification of specifically interacting proteins in the case of multiple
paralogs inside a species, down to the prediction of residues being in physical
contact across interaction interfaces. Statistical inference methods detecting
residue-residue coevolution have recently triggered considerable progress in
using sequence data for quaternary protein structure prediction; they require,
however, large joint alignments of homologous protein pairs known to interact.
The generation of such alignments is a complex computational task on its own;
application of coevolutionary modeling has in turn been restricted to proteins
without paralogs, or to bacterial systems with the corresponding coding genes
being co-localized in operons. Here we show that the Direct-Coupling Analysis
of residue coevolution can be extended to connect the different scales, and
simultaneously to match interacting paralogs, to identify inter-protein
residue-residue contacts and to discriminate interacting from noninteracting
families in a multiprotein system. Our results extend the potential
applications of coevolutionary analysis far beyond cases treatable so far.Comment: Main Text 19 pages Supp. Inf. 16 page
06061 Abstracts Collection -- Theory of Evolutionary Algorithms
From 05.02.06 to 10.02.06, the Dagstuhl Seminar 06061 ``Theory of Evolutionary Algorithms\u27\u27 was held in the International Conference and Research Center (IBFI),
Schloss Dagstuhl.
During the seminar, several participants presented their current
research, and ongoing work and open problems were discussed. Abstracts of
the presentations given during the seminar as well as abstracts of
seminar results and ideas are put together in this paper. The first section
describes the seminar topics and goals in general.
Links to extended abstracts or full papers are provided, if available
Shared Nearest-Neighbor Quantum Game-Based Attribute Reduction with Hierarchical Coevolutionary Spark and Its Application in Consistent Segmentation of Neonatal Cerebral Cortical Surfaces
© 2012 IEEE. The unprecedented increase in data volume has become a severe challenge for conventional patterns of data mining and learning systems tasked with handling big data. The recently introduced Spark platform is a new processing method for big data analysis and related learning systems, which has attracted increasing attention from both the scientific community and industry. In this paper, we propose a shared nearest-neighbor quantum game-based attribute reduction (SNNQGAR) algorithm that incorporates the hierarchical coevolutionary Spark model. We first present a shared coevolutionary nearest-neighbor hierarchy with self-evolving compensation that considers the features of nearest-neighborhood attribute subsets and calculates the similarity between attribute subsets according to the shared neighbor information of attribute sample points. We then present a novel attribute weight tensor model to generate ranking vectors of attributes and apply them to balance the relative contributions of different neighborhood attribute subsets. To optimize the model, we propose an embedded quantum equilibrium game paradigm (QEGP) to ensure that noisy attributes do not degrade the big data reduction results. A combination of the hierarchical coevolutionary Spark model and an improved MapReduce framework is then constructed that it can better parallelize the SNNQGAR to efficiently determine the preferred reduction solutions of the distributed attribute subsets. The experimental comparisons demonstrate the superior performance of the SNNQGAR, which outperforms most of the state-of-the-art attribute reduction algorithms. Moreover, the results indicate that the SNNQGAR can be successfully applied to segment overlapping and interdependent fuzzy cerebral tissues, and it exhibits a stable and consistent segmentation performance for neonatal cerebral cortical surfaces
Multi - objective cooperative neuro - evolution of recurrent neural networks for time series prediction
Cooperative coevolution is an evolutionary computation method which solves a problem by decomposing it into smaller subcomponents. Multi-objective optimization
deals with conflicting objectives and produces multiple optimal solutions instead of a single global optimal solution. In previous work, a multi-objective cooperative co-evolutionary method was introduced for training feedforward neural networks on time series problems. In this paper, the same method is used for training recurrent neural networks. The proposed approach is
tested on time series problems in which the different time-lags represent the different objectives. Multiple pre-processed datasets distinguished by their time-lags are used for training and testing. This results in the discovery of a single neural network that can correctly give predictions for data pre-processed using different
time-lags. The method is tested on several benchmark time
series problems on which it gives a competitive performance in comparison to the methods in the literature
Knowledge management overview of feature selection problem in high-dimensional financial data: Cooperative co-evolution and Map Reduce perspectives
The term big data characterizes the massive amounts of data generation by the advanced technologies in different domains using 4Vs volume, velocity, variety, and veracity-to indicate the amount of data that can only be processed via computationally intensive analysis, the speed of their creation, the different types of data, and their accuracy. High-dimensional financial data, such as time-series and space-Time data, contain a large number of features (variables) while having a small number of samples, which are used to measure various real-Time business situations for financial organizations. Such datasets are normally noisy, and complex correlations may exist between their features, and many domains, including financial, lack the al analytic tools to mine the data for knowledge discovery because of the high-dimensionality. Feature selection is an optimization problem to find a minimal subset of relevant features that maximizes the classification accuracy and reduces the computations. Traditional statistical-based feature selection approaches are not adequate to deal with the curse of dimensionality associated with big data. Cooperative co-evolution, a meta-heuristic algorithm and a divide-And-conquer approach, decomposes high-dimensional problems into smaller sub-problems. Further, MapReduce, a programming model, offers a ready-To-use distributed, scalable, and fault-Tolerant infrastructure for parallelizing the developed algorithm. This article presents a knowledge management overview of evolutionary feature selection approaches, state-of-The-Art cooperative co-evolution and MapReduce-based feature selection techniques, and future research directions
Distributed evolutionary algorithms and their models: A survey of the state-of-the-art
The increasing complexity of real-world optimization problems raises new challenges to evolutionary computation. Responding to these challenges, distributed evolutionary computation has received considerable attention over the past decade. This article provides a comprehensive survey of the state-of-the-art distributed evolutionary algorithms and models, which have been classified into two groups according to their task division mechanism. Population-distributed models are presented with master-slave, island, cellular, hierarchical, and pool architectures, which parallelize an evolution task at population, individual, or operation levels. Dimension-distributed models include coevolution and multi-agent models, which focus on dimension reduction. Insights into the models, such as synchronization, homogeneity, communication, topology, speedup, advantages and disadvantages are also presented and discussed. The study of these models helps guide future development of different and/or improved algorithms. Also highlighted are recent hotspots in this area, including the cloud and MapReduce-based implementations, GPU and CUDA-based implementations, distributed evolutionary multiobjective optimization, and real-world applications. Further, a number of future research directions have been discussed, with a conclusion that the development of distributed evolutionary computation will continue to flourish
Adaptive digital watermarking scheme based on support vector machines and optimized genetic algorithm
Digital watermarking is an effective solution to the problem of copyright protection, thus maintaining the security of digital products in the network. An improved scheme to increase the robustness of embedded information on the basis of discrete cosine transform (DCT) domain is proposed in this study. The embedding process consisted of two main procedures. Firstly, the embedding intensity with support vector machines (SVMs) was adaptively strengthened by training 1600 image blocks which are of different texture and luminance. Secondly, the embedding position with the optimized genetic algorithm (GA) was selected. To optimize GA, the best individual in the first place of each generation directly went into the next generation, and the best individual in the second position participated in the crossover and the mutation process. The transparency reaches 40.5 when GA’s generation number is 200. A case study was conducted on a 256 × 256 standard Lena image with the proposed method. After various attacks (such as cropping, JPEG compression, Gaussian low-pass filtering (3, 0. 5), histogram equalization, and contrast increasing (0.5, 0.6)) on the watermarked image, the extracted watermark was compared with the original one. Results demonstrate that the watermark can be effectively recovered after these attacks. Even though the algorithm is weak against rotation attacks, it provides high quality in imperceptibility and robustness and hence it is a successful candidate for implementing novel image watermarking scheme meeting real timelines
- …