10 research outputs found

    Fusion and community detection in multi-layer graphs

    Relational data arising in many domains can be represented by networks (or graphs), with nodes capturing entities and edges representing relationships between these entities. Community detection in networks has become one of the most important problems, with a broad range of applications. Until recently, the vast majority of papers focused on discovering community structures in a single network. However, with the emergence of multi-view network data in many real-world applications, and consequently the advent of multi-layer graph representations, community detection in multi-layer graphs has become a new challenge. Multi-layer graphs provide complementary views of connectivity patterns of the same set of vertices. Fusion of the network layers is expected to achieve better clustering performance. In this paper, we propose two novel methods, named WSSNMTF (Weighted Simultaneous Symmetric Non-Negative Matrix Tri-Factorization) and NG-WSSNMTF (Natural Gradient WSSNMTF), for fusion and clustering of multi-layer graphs. Both methods are robust with respect to missing edges and noise. We compare the performance of the proposed methods with two baseline methods, as well as with three state-of-the-art methods, on synthetic and three real-world datasets. The experimental results indicate superior performance of the proposed methods.
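To make the idea of layer fusion concrete, here is a minimal sketch in Python. It is not WSSNMTF or NG-WSSNMTF, and it is not one of the baselines evaluated in the paper. It assumes a deliberately naive scheme: average the adjacency matrices of the layers, keep edges whose fused weight reaches a threshold, and read communities off the connected components. The function names, the toy graph, and the threshold of 0.5 are all hypothetical choices made for illustration.

```python
from typing import List, Sequence, Tuple

Matrix = List[List[float]]


def adjacency(n: int, edges: Sequence[Tuple[int, int]]) -> Matrix:
    """Build a symmetric 0/1 adjacency matrix for an undirected graph."""
    A = [[0.0] * n for _ in range(n)]
    for u, v in edges:
        A[u][v] = 1.0
        A[v][u] = 1.0
    return A


def fuse_layers(layers: Sequence[Matrix], weights: Sequence[float]) -> Matrix:
    """Weighted average of adjacency matrices defined on the same vertex set."""
    if not layers or len(layers) != len(weights):
        raise ValueError("need one weight per layer and at least one layer")
    n = len(layers[0])
    total = float(sum(weights))
    fused = [[0.0] * n for _ in range(n)]
    for A, w in zip(layers, weights):
        for i in range(n):
            for j in range(n):
                fused[i][j] += w * A[i][j] / total
    return fused


def threshold_communities(fused: Matrix, threshold: float) -> List[int]:
    """Label each vertex by its connected component after dropping weak edges."""
    n = len(fused)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:  # depth-first search over edges with weight >= threshold
            u = stack.pop()
            for v in range(n):
                if labels[v] == -1 and fused[u][v] >= threshold:
                    labels[v] = current
                    stack.append(v)
        current += 1
    return labels


# Toy data (hypothetical): planted communities {0, 1, 2} and {3, 4, 5}.
# Every layer misses some in-community edges; two layers contain a noisy edge.
layer_a = adjacency(6, [(0, 1), (1, 2), (3, 4), (4, 5), (2, 3)])  # noise: 2-3
layer_b = adjacency(6, [(0, 1), (0, 2), (3, 4), (3, 5)])
layer_c = adjacency(6, [(1, 2), (0, 2), (4, 5), (3, 5), (0, 5)])  # noise: 0-5

single = threshold_communities(fuse_layers([layer_a], [1.0]), 0.5)
fused = threshold_communities(
    fuse_layers([layer_a, layer_b, layer_c], [1.0, 1.0, 1.0]), 0.5
)
print(single)  # [0, 0, 0, 0, 0, 0]: the noisy edge merges both communities
print(fused)   # [0, 0, 0, 1, 1, 1]: fusion suppresses noise seen in one layer
```

In this toy case each in-community edge appears in two of the three layers, while each noisy edge appears in only one, so averaging separates them. Real multi-layer data rarely behaves this cleanly, which is the motivation for learned fusion methods such as those proposed in the abstract.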

    Multi-Source Multi-View Clustering via Discrepancy Penalty

    With the advance of technology, entities can be observed in multiple views. Multiple views containing different types of features can be used for clustering. Although multi-view clustering has been successfully applied in many applications, the previous methods usually assume the complete instance mapping between different views. In many real-world applications, information can be gathered from multiple sources, while each source can contain multiple views, which are more cohesive for learning. The views under the same source are usually fully mapped, but they can be very heterogeneous. Moreover, the mappings between different sources are usually incomplete and partially observed, which makes it more difficult to integrate all the views across different sources. In this paper, we propose MMC (Multi-source Multi-view Clustering), which is a framework based on collective spectral clustering with a discrepancy penalty across sources, to tackle these challenges. MMC has several advantages compared with other existing methods. First, MMC can deal with incomplete mapping between sources. Second, it considers the disagreements between sources while treating views in the same source as a cohesive set. Third, MMC also tries to infer the instance similarities across sources to enhance the clustering performance. Extensive experiments conducted on real-world data demonstrate the effectiveness of the proposed approach

    Multi-View Multiple Clusterings using Deep Matrix Factorization

    Multi-view clustering aims at integrating complementary information from multiple heterogeneous views to improve clustering results. Existing multi-view clustering solutions can only output a single clustering of the data. Due to their multiplicity, multi-view data can have different groupings that are reasonable and interesting from different perspectives. However, how to find multiple, meaningful, and diverse clustering results from multi-view data is still a rarely studied and challenging topic in multi-view clustering and multiple clusterings. In this paper, we introduce a deep matrix factorization based solution (DMClusts) to discover multiple clusterings. DMClusts gradually factorizes multi-view data matrices into representational subspaces layer-by-layer and generates one clustering in each layer. To enforce the diversity between generated clusterings, it minimizes a new redundancy quantification term derived from the proximity between samples in these subspaces. We further introduce an iterative optimization procedure to simultaneously seek multiple clusterings with quality and diversity. Experimental results on benchmark datasets confirm that DMClusts outperforms state-of-the-art multiple clustering solutions.

    Clustering Service Networks with Entity, Attribute, and Link Heterogeneity

    Many popular web service networks are content-rich in terms of heterogeneous types of entities and links, associated with incomplete attributes. Clustering such heterogeneous service networks demands new clustering techniques that can handle two heterogeneity challenges: (1) multiple types of entities co-exist in the same service network with multiple attributes, and (2) links between entities have diverse types and carry different semantics. Existing heterogeneous graph clustering techniques tend to pick initial centroids uniformly at random, specify the number k of clusters in advance, and fix k during the clustering process. In this paper, we propose Service Cluster, a novel heterogeneous service network clustering algorithm with four unique features. First, we incorporate various types of entity, attribute and link information into a unified distance measure. Second, we design a Discrete Steepest Descent method to naturally produce initial k and initial centroids simultaneously. Third, we propose a dynamic learning method to automatically adjust the link weights towards clustering convergence. Fourth, we develop an effective optimization strategy to identify new suitable k and k well-chosen centroids at each clustering iteration. Extensive evaluation on real datasets demonstrates that Service Cluster outperforms existing representative methods in terms of both effectiveness and efficiency

    New Approaches in Multi-View Clustering

    Many real-world datasets can be naturally described by multiple views. Due to this, multi-view learning has drawn much attention from both academia and industry. Compared to single-view learning, multi-view learning has demonstrated plenty of advantages. Clustering has long been serving as a critical technique in data mining and machine learning. Recently, multi-view clustering has achieved great success in various applications. To provide a comprehensive review of the typical multi-view clustering methods and their corresponding recent developments, this chapter summarizes five kinds of popular clustering methods and their multi-view learning versions, which include k-means, spectral clustering, matrix factorization, tensor decomposition, and deep learning. These clustering methods are the most widely employed algorithms for single-view data, and lots of efforts have been devoted to extending them for multi-view clustering. Besides, many other multi-view clustering methods can be unified into the frameworks of these five methods. To promote further research and development of multi-view clustering, some popular and open datasets are summarized in two categories. Furthermore, several open issues that deserve more exploration are pointed out in the end

    On relational learning and discovery in social networks: a survey

    The social networking scene has evolved tremendously over the years. It has grown in relational complexities that extend a vast presence onto popular social media platforms on the internet. With the advance of sentimental computing and social complexity, relationships which were once thought to be simple have now become multi-dimensional and widespread in the online scene. This explosion in the online social scene has attracted much research attention. The main aims of this work revolve around the knowledge discovery and datamining processes of these feature-rich relations. In this paper, we provide a survey of relational learning and discovery through popular social analysis of different structure types which are integral to applications within the emerging field of sentimental and affective computing. It is hoped that this contribution will add to the clarity of how social networks are analyzed with the latest groundbreaking methods and provide certain directions for future improvements

    Flexible and robust co-regularized multi-domain graph clustering

    Multi-view graph clustering aims to enhance clustering performance by integrating heterogeneous information collected in different domains. Each domain provides a different view of the data instances. Leveraging cross-domain information has been demonstrated to be an effective way to achieve better clustering results. Despite the previous success, existing multi-view graph clustering methods usually assume that different views are available for the same set of instances. Thus instances in different domains can be treated as having a strict one-to-one relationship. In many real-life applications, however, data instances in one domain may correspond to multiple instances in another domain. Moreover, relationships between instances in different domains may be associated with weights based on prior (partial) knowledge. In this paper, we propose a flexible and robust framework, CGC (Co-regularized Graph Clustering), based on non-negative matrix factorization (NMF), to tackle these challenges. CGC has several advantages over the existing methods. First, it supports many-to-many cross-domain instance relationships. Second, it incorporates weights on cross-domain relationships. Third, it allows partial cross-domain mapping, so that graphs in different domains may have different sizes. Finally, it provides users with the extent to which the cross-domain instance relationship violates the in-domain clustering structure, and thus enables users to re-evaluate the consistency of the relationship. Extensive experimental results on UCI benchmark data sets, newsgroup data sets and biological interaction networks demonstrate the effectiveness of our approach.