16,201 research outputs found

    Median evidential c-means algorithm and its application to community detection

    Get PDF
    Median clustering is of great value for partitioning relational data. In this paper, a new prototype-based clustering method, called Median Evidential C-Means (MECM), which is an extension of median c-means and median fuzzy c-means on the theoretical framework of belief functions is proposed. The median variant relaxes the restriction of a metric space embedding for the objects but constrains the prototypes to be in the original data set. Due to these properties, MECM could be applied to graph clustering problems. A community detection scheme for social networks based on MECM is investigated and the obtained credal partitions of graphs, which are more refined than crisp and fuzzy ones, enable us to have a better understanding of the graph structures. An initial prototype-selection scheme based on evidential semi-centrality is presented to avoid local premature convergence and an evidential modularity function is defined to choose the optimal number of communities. Finally, experiments in synthetic and real data sets illustrate the performance of MECM and show its difference to other methods

    A similarity-based community detection method with multiple prototype representation

    Get PDF
    Communities are of great importance for understanding graph structures in social networks. Some existing community detection algorithms use a single prototype to represent each group. In real applications, this may not adequately model the different types of communities and hence limits the clustering performance on social networks. To address this problem, a Similarity-based Multi-Prototype (SMP) community detection approach is proposed in this paper. In SMP, vertices in each community carry various weights to describe their degree of representativeness. This mechanism enables each community to be represented by more than one node. The centrality of nodes is used to calculate prototype weights, while similarity is utilized to guide us to partitioning the graph. Experimental results on computer generated and real-world networks clearly show that SMP performs well for detecting communities. Moreover, the method could provide richer information for the inner structure of the detected communities with the help of prototype weights compared with the existing community detection models

    Modularity functions maximization with nonnegative relaxation facilitates community detection in networks

    Full text link
    We show here that the problem of maximizing a family of quantitative functions, encompassing both the modularity (Q-measure) and modularity density (D-measure), for community detection can be uniformly understood as a combinatoric optimization involving the trace of a matrix called modularity Laplacian. Instead of using traditional spectral relaxation, we apply additional nonnegative constraint into this graph clustering problem and design efficient algorithms to optimize the new objective. With the explicit nonnegative constraint, our solutions are very close to the ideal community indicator matrix and can directly assign nodes into communities. The near-orthogonal columns of the solution can be reformulated as the posterior probability of corresponding node belonging to each community. Therefore, the proposed method can be exploited to identify the fuzzy or overlapping communities and thus facilitates the understanding of the intrinsic structure of networks. Experimental results show that our new algorithm consistently, sometimes significantly, outperforms the traditional spectral relaxation approaches

    Identifying Overlapping and Hierarchical Thematic Structures in Networks of Scholarly Papers: A Comparison of Three Approaches

    Get PDF
    We implemented three recently proposed approaches to the identification of overlapping and hierarchical substructures in graphs and applied the corresponding algorithms to a network of 492 information-science papers coupled via their cited sources. The thematic substructures obtained and overlaps produced by the three hierarchical cluster algorithms were compared to a content-based categorisation, which we based on the interpretation of titles and keywords. We defined sets of papers dealing with three topics located on different levels of aggregation: h-index, webometrics, and bibliometrics. We identified these topics with branches in the dendrograms produced by the three cluster algorithms and compared the overlapping topics they detected with one another and with the three pre-defined paper sets. We discuss the advantages and drawbacks of applying the three approaches to paper networks in research fields.Comment: 18 pages, 9 figure

    Searching for network modules

    Full text link
    When analyzing complex networks a key target is to uncover their modular structure, which means searching for a family of modules, namely node subsets spanning each a subnetwork more densely connected than the average. This work proposes a novel type of objective function for graph clustering, in the form of a multilinear polynomial whose coefficients are determined by network topology. It may be thought of as a potential function, to be maximized, taking its values on fuzzy clusterings or families of fuzzy subsets of nodes over which every node distributes a unit membership. When suitably parametrized, this potential is shown to attain its maximum when every node concentrates its all unit membership on some module. The output thus is a partition, while the original discrete optimization problem is turned into a continuous version allowing to conceive alternative search strategies. The instance of the problem being a pseudo-Boolean function assigning real-valued cluster scores to node subsets, modularity maximization is employed to exemplify a so-called quadratic form, in that the scores of singletons and pairs also fully determine the scores of larger clusters, while the resulting multilinear polynomial potential function has degree 2. After considering further quadratic instances, different from modularity and obtained by interpreting network topology in alternative manners, a greedy local-search strategy for the continuous framework is analytically compared with an existing greedy agglomerative procedure for the discrete case. Overlapping is finally discussed in terms of multiple runs, i.e. several local searches with different initializations.Comment: 10 page

    Evidential Communities for Complex Networks

    Get PDF
    Community detection is of great importance for understand-ing graph structure in social networks. The communities in real-world networks are often overlapped, i.e. some nodes may be a member of multiple clusters. How to uncover the overlapping communities/clusters in a complex network is a general problem in data mining of network data sets. In this paper, a novel algorithm to identify overlapping communi-ties in complex networks by a combination of an evidential modularity function, a spectral mapping method and evidential c-means clustering is devised. Experimental results indicate that this detection approach can take advantage of the theory of belief functions, and preforms good both at detecting community structure and determining the appropri-ate number of clusters. Moreover, the credal partition obtained by the proposed method could give us a deeper insight into the graph structure

    Evidential relational clustering using medoids

    Get PDF
    In real clustering applications, proximity data, in which only pairwise similarities or dissimilarities are known, is more general than object data, in which each pattern is described explicitly by a list of attributes. Medoid-based clustering algorithms, which assume the prototypes of classes are objects, are of great value for partitioning relational data sets. In this paper a new prototype-based clustering method, named Evidential C-Medoids (ECMdd), which is an extension of Fuzzy C-Medoids (FCMdd) on the theoretical framework of belief functions is proposed. In ECMdd, medoids are utilized as the prototypes to represent the detected classes, including specific classes and imprecise classes. Specific classes are for the data which are distinctly far from the prototypes of other classes, while imprecise classes accept the objects that may be close to the prototypes of more than one class. This soft decision mechanism could make the clustering results more cautious and reduce the misclassification rates. Experiments in synthetic and real data sets are used to illustrate the performance of ECMdd. The results show that ECMdd could capture well the uncertainty in the internal data structure. Moreover, it is more robust to the initializations compared with FCMdd.Comment: in The 18th International Conference on Information Fusion, July 2015, Washington, DC, USA , Jul 2015, Washington, United State
    • 

    corecore