    Self-adaptive GA, quantitative semantic similarity measures and ontology-based text clustering

    As common clustering algorithms use the vector space model (VSM) to represent documents, the conceptual relationships between related terms that do not co-occur literally are ignored. A genetic algorithm-based clustering technique, named GA clustering, is proposed in this article, in conjunction with ontology, to overcome this problem. In general, the ontology measures can be partitioned into two categories: thesaurus-based methods and corpus-based methods. We take advantage of the hierarchical structure and the broad-coverage taxonomy of WordNet as the thesaurus-based ontology. However, the corpus-based method is rather complicated to handle in practical applications. We propose a transformed latent semantic analysis (LSA) model as the corpus-based method in this paper. Moreover, two hybrid strategies, which combine the various similarity measures, are implemented in the clustering experiments. The results show that our GA clustering algorithm, in conjunction with the thesaurus-based and the LSA-based methods, apparently outperforms that with other similarity measures. Moreover, the superiority of the proposed GA clustering algorithm over the commonly used k-means algorithm and the standard GA is demonstrated by the improvements in clustering performance.
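    The limitation of the plain VSM mentioned in this abstract can be illustrated with a minimal sketch. It is not the paper's method: the toy documents, the whitespace tokenisation and the `cosine` helper are all assumptions made for illustration. It only shows that bag-of-words cosine similarity is zero for related documents that share no literal term.

```python
import math
from collections import Counter

def cosine(doc_a, doc_b):
    """Cosine similarity between two bag-of-words term-frequency vectors."""
    a = Counter(doc_a.lower().split())
    b = Counter(doc_b.lower().split())
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Toy documents (assumed): d1 and d2 are conceptually related but share no word.
d1 = "car engine repair"
d2 = "automobile motor maintenance"
d3 = "car engine noise"

sim_related = cosine(d1, d2)  # related meaning, no shared terms
sim_overlap = cosine(d1, d3)  # two shared terms out of three
print(sim_related, sim_overlap)
```

    A thesaurus-based or corpus-based measure, as described in the abstract, would be expected to assign the first pair a non-zero similarity.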

    A maximal clique based multiobjective evolutionary algorithm for overlapping community detection

    Detecting community structure has become an important technique for studying complex networks. Although many community detection algorithms have been proposed, most of them focus on separated communities, where each node can belong to only one community. However, in many real-world networks, communities often overlap with each other. Developing overlapping community detection algorithms thus becomes necessary. Along this avenue, this paper proposes a maximal clique based multiobjective evolutionary algorithm for overlapping community detection. In this algorithm, a new representation scheme based on the introduced maximal-clique graph is presented. Since the maximal-clique graph is defined by using a set of maximal cliques of the original graph as nodes, and two maximal cliques are allowed to share the same nodes of the original graph, overlap is an intrinsic property of the maximal-clique graph. Owing to this property, the new representation scheme allows multiobjective evolutionary algorithms to handle the overlapping community detection problem in a way similar to that of separated community detection, such that the optimization problems are simplified. As a result, the proposed algorithm can detect overlapping community structure with higher partition accuracy and lower computational cost than the existing ones. The experiments on both synthetic and real-world networks validate the effectiveness and efficiency of the proposed algorithm.
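    The maximal-clique graph idea can be sketched as follows. This is an illustrative sketch only: the abstract does not say how two cliques are linked, so the rule used here (link two cliques when they share at least one original node) is an assumption, as are the function names and the toy graph. Clique enumeration uses the basic Bron-Kerbosch procedure without pivoting.

```python
def maximal_cliques(adj):
    """Enumerate all maximal cliques with basic Bron-Kerbosch (no pivoting)."""
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(frozenset(r))  # r cannot be extended further
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}  # v is done as a candidate
            x = x | {v}  # and is now excluded

    expand(set(), set(adj), set())
    return cliques

def clique_graph(cliques):
    """Assumed linking rule: connect two cliques that share an original node."""
    return {
        (i, j)
        for i in range(len(cliques))
        for j in range(i + 1, len(cliques))
        if cliques[i] & cliques[j]
    }

# Toy graph (assumed): triangles {0, 1, 2} and {2, 3, 4} sharing node 2.
edge_list = [(0, 1), (0, 2), (1, 2), (2, 3), (2, 4), (3, 4)]
adj = {}
for u, v in edge_list:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

cliques = maximal_cliques(adj)
links = clique_graph(cliques)
print(cliques, links)
```

    Node 2 appears in both cliques, so any partition of the clique nodes into separate groups can still place node 2 in two communities, which is the sense in which overlap is intrinsic to this representation.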

    Identifying component modules

    A computer-based system for modelling component dependencies and identifying component modules is presented. A variation of the Dependency Structure Matrix (DSM) representation was used to model component dependencies. The system utilises a two-stage approach towards facilitating the identification of a hierarchical modular structure. The first stage calculates a value for a clustering criterion that may be used to group component dependencies together. A Genetic Algorithm is described that optimises the order of the components within the DSM, with the aim of minimising the value of the clustering criterion, to identify the most significant component groupings (modules) within the product structure. The second stage utilises a 'Module Strength Indicator' (MSI) function to determine a value representative of the degree of modularity of the component groupings. The application of this function to the DSM produces a 'Module Structure Matrix' (MSM) depicting the relative modularity of available component groupings within it. The approach enabled the identification of hierarchical modularity in the product structure without the requirement for any additional domain-specific knowledge within the system. The system supports design by providing mechanisms to explicitly represent and utilise component and dependency knowledge to facilitate the nontrivial task of determining near-optimal component modules and representing product modularity.
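    The first stage described in this abstract, reordering a DSM to minimise a clustering criterion, can be illustrated with a small sketch. The abstract does not define the criterion, so the one below (dependency weight multiplied by distance from the diagonal) is a hypothetical stand-in, and exhaustive search over orderings replaces the Genetic Algorithm because the toy matrix is tiny. All names and data are assumptions.

```python
from itertools import permutations

def clustering_criterion(dsm, order):
    """Hypothetical criterion: dependency weight times distance from the diagonal."""
    pos = {comp: k for k, comp in enumerate(order)}
    n = len(dsm)
    return sum(
        dsm[i][j] * abs(pos[i] - pos[j]) for i in range(n) for j in range(n)
    )

def best_order(dsm):
    """Exhaustive search over orderings; a stand-in for the GA on tiny inputs."""
    return min(
        permutations(range(len(dsm))),
        key=lambda order: clustering_criterion(dsm, order),
    )

# Toy symmetric DSM (assumed): components 0 and 2 depend on each other,
# as do components 1 and 3.
dsm = [
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
]

initial = clustering_criterion(dsm, (0, 1, 2, 3))
order = best_order(dsm)
optimised = clustering_criterion(dsm, order)
print(initial, order, optimised)
```

    In the reordered matrix the dependent components sit next to each other, so their marks cluster along the diagonal and candidate modules become visible as blocks.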