693 research outputs found
A Survey on Soft Subspace Clustering
Subspace clustering (SC) is a promising clustering technology to identify
clusters based on their associations with subspaces in high dimensional spaces.
SC can be classified into hard subspace clustering (HSC) and soft subspace
clustering (SSC). While HSC algorithms have been extensively studied and well
accepted by the scientific community, SSC algorithms are relatively new but
gaining more attention in recent years due to better adaptability. In the
paper, a comprehensive survey on existing SSC algorithms and the recent
development are presented. The SSC algorithms are classified systematically
into three main categories, namely, conventional SSC (CSSC), independent SSC
(ISSC) and extended SSC (XSSC). The characteristics of these algorithms are
highlighted and the potential future development of SSC is also discussed.Comment: This paper has been published in Information Sciences Journal in 201
Coevolutionary GA with schema extraction by machine learning techniques and its application to knapsack problems
The authors introduce a novel coevolutionary genetic algorithm with schema extraction by machine learning techniques. Our CGA consists of two GA populations: the first GA (H-GA) searches for the solutions in the given problems and the second GA (P-GA) searches for effective schemata of the H-GA. We aim to improve the search ability of our CGA by extracting more efficiently useful schemata from the H-GA population, and then incorporating those extracted schemata in a natural manner into the P-GA. Several computational simulations on multidimensional knapsack problems confirm the effectiveness of the proposed method</p
Using Datamining Techniques to Help Metaheuristics: A Short Survey
International audienceHybridizing metaheuristic approaches becomes a common way to improve the efficiency of optimization methods. Many hybridizations deal with the combination of several optimization methods. In this paper we are interested in another type of hybridization, where datamining approaches are combined within an optimization process. Hence, we propose to study the interest of combining metaheuristics and datamining through a short survey that enumerates the different opportunities of such combinations based on literature examples
Attribute Equilibrium Dominance Reduction Accelerator (DCCAEDR) Based on Distributed Coevolutionary Cloud and Its Application in Medical Records
© 2013 IEEE. Aimed at the tremendous challenge of attribute reduction for big data mining and knowledge discovery, we propose a new attribute equilibrium dominance reduction accelerator (DCCAEDR) based on the distributed coevolutionary cloud model. First, the framework of N-populations distributed coevolutionary MapReduce model is designed to divide the entire population into N subpopulations, sharing the reward of different subpopulations' solutions under a MapReduce cloud mechanism. Because the adaptive balancing between exploration and exploitation can be achieved in a better way, the reduction performance is guaranteed to be the same as those using the whole independent data set. Second, a novel Nash equilibrium dominance strategy of elitists under the N bounded rationality regions is adopted to assist the subpopulations necessary to attain the stable status of Nash equilibrium dominance. This further enhances the accelerator's robustness against complex noise on big data. Third, the approximation parallelism mechanism based on MapReduce is constructed to implement rule reduction by accelerating the computation of attribute equivalence classes. Consequently, the entire attribute reduction set with the equilibrium dominance solution can be achieved. Extensive simulation results have been used to illustrate the effectiveness and robustness of the proposed DCCAEDR accelerator for attribute reduction on big data. Furthermore, the DCCAEDR is applied to solve attribute reduction for traditional Chinese medical records and to segment cortical surfaces of the neonatal brain 3-D-MRI records, and the DCCAEDR shows the superior competitive results, when compared with the representative algorithms
Intelligent Financial Fraud Detection Practices: An Investigation
Financial fraud is an issue with far reaching consequences in the finance
industry, government, corporate sectors, and for ordinary consumers. Increasing
dependence on new technologies such as cloud and mobile computing in recent
years has compounded the problem. Traditional methods of detection involve
extensive use of auditing, where a trained individual manually observes reports
or transactions in an attempt to discover fraudulent behaviour. This method is
not only time consuming, expensive and inaccurate, but in the age of big data
it is also impractical. Not surprisingly, financial institutions have turned to
automated processes using statistical and computational methods. This paper
presents a comprehensive investigation on financial fraud detection practices
using such data mining methods, with a particular focus on computational
intelligence-based techniques. Classification of the practices based on key
aspects such as detection algorithm used, fraud type investigated, and success
rate have been covered. Issues and challenges associated with the current
practices and potential future direction of research have also been identified.Comment: Proceedings of the 10th International Conference on Security and
Privacy in Communication Networks (SecureComm 2014
Shared Nearest-Neighbor Quantum Game-Based Attribute Reduction with Hierarchical Coevolutionary Spark and Its Application in Consistent Segmentation of Neonatal Cerebral Cortical Surfaces
© 2012 IEEE. The unprecedented increase in data volume has become a severe challenge for conventional patterns of data mining and learning systems tasked with handling big data. The recently introduced Spark platform is a new processing method for big data analysis and related learning systems, which has attracted increasing attention from both the scientific community and industry. In this paper, we propose a shared nearest-neighbor quantum game-based attribute reduction (SNNQGAR) algorithm that incorporates the hierarchical coevolutionary Spark model. We first present a shared coevolutionary nearest-neighbor hierarchy with self-evolving compensation that considers the features of nearest-neighborhood attribute subsets and calculates the similarity between attribute subsets according to the shared neighbor information of attribute sample points. We then present a novel attribute weight tensor model to generate ranking vectors of attributes and apply them to balance the relative contributions of different neighborhood attribute subsets. To optimize the model, we propose an embedded quantum equilibrium game paradigm (QEGP) to ensure that noisy attributes do not degrade the big data reduction results. A combination of the hierarchical coevolutionary Spark model and an improved MapReduce framework is then constructed that it can better parallelize the SNNQGAR to efficiently determine the preferred reduction solutions of the distributed attribute subsets. The experimental comparisons demonstrate the superior performance of the SNNQGAR, which outperforms most of the state-of-the-art attribute reduction algorithms. Moreover, the results indicate that the SNNQGAR can be successfully applied to segment overlapping and interdependent fuzzy cerebral tissues, and it exhibits a stable and consistent segmentation performance for neonatal cerebral cortical surfaces
Automatic Network Fingerprinting through Single-Node Motifs
Complex networks have been characterised by their specific connectivity
patterns (network motifs), but their building blocks can also be identified and
described by node-motifs---a combination of local network features. One
technique to identify single node-motifs has been presented by Costa et al. (L.
D. F. Costa, F. A. Rodrigues, C. C. Hilgetag, and M. Kaiser, Europhys. Lett.,
87, 1, 2009). Here, we first suggest improvements to the method including how
its parameters can be determined automatically. Such automatic routines make
high-throughput studies of many networks feasible. Second, the new routines are
validated in different network-series. Third, we provide an example of how the
method can be used to analyse network time-series. In conclusion, we provide a
robust method for systematically discovering and classifying characteristic
nodes of a network. In contrast to classical motif analysis, our approach can
identify individual components (here: nodes) that are specific to a network.
Such special nodes, as hubs before, might be found to play critical roles in
real-world networks.Comment: 16 pages (4 figures) plus supporting information 8 pages (5 figures
- …