43,959 research outputs found

    Deep Representation-aligned Graph Multi-view Clustering for Limited Labeled Multi-modal Health Data

    Today, many fields are characterised by extensive quantities of data from a wide range of dissimilar sources and domains. One such field is medicine, in which data contain exhaustive combinations of spatial, temporal, linear, and relational data. Because much of this data lacks expert-assessed labels, its analysis falls within unsupervised or semi-supervised learning. Since a higher number of views provides more ways to recognise commonality across views, contrastive multi-view clustering may be used to train a model to suppress redundancy and otherwise medically irrelevant information. Yet standard multi-view clustering approaches do not account for relational graph data. Recent developments aim to solve this with various graph operations, including graph-based attention. However, within deep graph-based multi-view clustering on a single view-invariant affinity graph, representation alignment remains unexplored. We introduce Deep Representation-Aligned Graph Multi-View Clustering (DRAGMVC), a novel attention-based graph multi-view clustering model. Comparing maximal performance, our model surpassed the state of the art in eleven out of twelve metrics on Cora, CiteSeer, and PubMed. The model considers view alignment at the sample level by employing a contrastive loss, and relational data through a novel take on graph attention embeddings in which we use a Markov chain prior to increase the receptive field of each layer. For clustering, a graph-induced DDC module is used. GraphSAINT sampling is implemented to control our mini-batch space and capitalise on our Markov prior. Additionally, we present the MIMIC pleural effusion graph multi-modal dataset, consisting of two modalities: 3520 chest X-ray images, along with two static views registered within a one-day time frame (vital signs and lab tests). Together these make up the three views of the dataset. We note a significant improvement in separability, view mixing, and clustering performance when comparing DRAGMVC to preceding non-graph multi-view clustering models, suggesting a possible, largely unexplored use case of unsupervised graph multi-view clustering on graph-induced, multi-modal, and complex medical data.
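
The abstract says only that view alignment is handled at the sample level with a contrastive loss; it does not give the formula. As a rough illustration, the sketch below assumes an InfoNCE-style loss over cosine similarities, where the two embeddings of the same sample form the positive pair and the other samples in the batch act as negatives. The function names, the temperature value, and the toy embeddings are hypothetical and are not taken from DRAGMVC.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def contrastive_alignment_loss(view_a, view_b, temperature=0.5):
    """Assumed InfoNCE-style loss: sample i in view A should be most
    similar to sample i in view B, relative to all other samples in B."""
    n = len(view_a)
    total = 0.0
    for i in range(n):
        logits = [cosine(view_a[i], view_b[j]) / temperature for j in range(n)]
        log_denominator = math.log(sum(math.exp(x) for x in logits))
        total += log_denominator - logits[i]  # -log softmax of the positive pair
    return total / n


# Toy embeddings (hypothetical): two samples, two views.
view_a = [[1.0, 0.0], [0.0, 1.0]]
aligned = contrastive_alignment_loss(view_a, [[1.0, 0.0], [0.0, 1.0]])
misaligned = contrastive_alignment_loss(view_a, [[0.0, 1.0], [1.0, 0.0]])
print(aligned < misaligned)
```

Under these assumptions the loss is small when each sample's embeddings agree across views and large when they are swapped, which is the behaviour a sample-level alignment term is meant to encourage.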

    Bayesian nonparametric clusterings in relational and high-dimensional settings with applications in bioinformatics.

    Recent advances in high throughput methodologies offer researchers the ability to understand complex systems via high dimensional and multi-relational data. One example is the realm of molecular biology where disparate data (such as gene sequence, gene expression, and interaction information) are available for various snapshots of biological systems. This type of high dimensional and multirelational data allows for unprecedented detailed analysis, but also presents challenges in accounting for all the variability. High dimensional data often has a multitude of underlying relationships, each represented by a separate clustering structure, where the number of structures is typically unknown a priori. To address the challenges faced by traditional clustering methods on high dimensional and multirelational data, we developed three feature selection and cross-clustering methods: 1) infinite relational model with feature selection (FIRM) which incorporates the rich information of multirelational data; 2) Bayesian Hierarchical Cross-Clustering (BHCC), a deterministic approximation to Cross Dirichlet Process mixture (CDPM) and to cross-clustering; and 3) randomized approximation (RBHCC), based on a truncated hierarchy. An extension of BHCC, Bayesian Congruence Measuring (BCM), is proposed to measure incongruence between genes and to identify sets of congruent loci with identical evolutionary histories. We adapt our BHCC algorithm to the inference of BCM, where the intended structure of each view (congruent loci) represents consistent evolutionary processes. We consider an application of FIRM on categorizing mRNA and microRNA. The model uses latent structures to encode the expression pattern and the gene ontology annotations. 
We also apply FIRM to recover the categories of ligands and proteins, and to predict unknown drug-target interactions, where the latent categorization structure encodes drug-target interaction, chemical compound similarity, and amino acid sequence similarity. BHCC and RBHCC are shown to have improved predictive performance (both in terms of cluster membership and missing value prediction) compared to traditional clustering methods. Our results suggest that these novel approaches to integrating multi-relational information have a promising future in the biological sciences, where incorporating data related to varying features is often regarded as a daunting task.

    Fusion and community detection in multi-layer graphs

    Relational data arising in many domains can be represented by networks (or graphs) with nodes capturing entities and edges representing relationships between these entities. Community detection in networks has become one of the most important problems, with a broad range of applications. Until recently, the vast majority of papers have focused on discovering community structures in a single network. However, with the emergence of multi-view network data in many real-world applications and consequently with the advent of multilayer graph representation, community detection in multi-layer graphs has become a new challenge. Multi-layer graphs provide complementary views of connectivity patterns of the same set of vertices. Fusion of the network layers is expected to achieve better clustering performance. In this paper, we propose two novel methods, named WSSNMTF (Weighted Simultaneous Symmetric Non-Negative Matrix Tri-Factorization) and NG-WSSNMTF (Natural Gradient WSSNMTF), for fusion and clustering of multi-layer graphs. Both methods are robust with respect to missing edges and noise. We compare the performance of the proposed methods with two baseline methods, as well as with three state-of-the-art methods, on synthetic datasets and three real-world datasets. The experimental results indicate superior performance of the proposed methods.
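
To make the idea of a multi-layer graph concrete, the sketch below represents each layer as an adjacency matrix over the same vertex set, fuses the layers by a weighted average, and reads communities off the fused graph as connected components above a threshold. This is a deliberately simple baseline for illustration only. It is not WSSNMTF or NG-WSSNMTF, whose update rules the abstract does not describe, and the function names, weights, threshold, and toy layers are all hypothetical.

```python
from collections import deque


def fuse_layers(layers, weights=None):
    """Weighted average of adjacency matrices that share one vertex set."""
    n = len(layers[0])
    if weights is None:
        weights = [1.0 / len(layers)] * len(layers)
    fused = [[0.0] * n for _ in range(n)]
    for weight, layer in zip(weights, layers):
        for i in range(n):
            for j in range(n):
                fused[i][j] += weight * layer[i][j]
    return fused


def communities(fused, threshold):
    """Connected components of the fused graph, keeping edges >= threshold."""
    n = len(fused)
    seen, groups = set(), []
    for start in range(n):
        if start in seen:
            continue
        seen.add(start)
        queue, group = deque([start]), []
        while queue:
            u = queue.popleft()
            group.append(u)
            for v in range(n):
                if v not in seen and fused[u][v] >= threshold:
                    seen.add(v)
                    queue.append(v)
        groups.append(sorted(group))
    return groups


# Toy layers (hypothetical): layer_a has a noisy edge 1-2 that layer_b lacks.
layer_a = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
layer_b = [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]
fused = fuse_layers([layer_a, layer_b])
print(communities(fused, threshold=0.75))
```

In this toy case the edge present in only one layer is down-weighted by the fusion and falls below the threshold, so the two communities that both layers agree on are recovered.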

    Multi-view constrained clustering with an incomplete mapping between views

    Multi-view learning algorithms typically assume a complete bipartite mapping between the different views in order to exchange information during the learning process. However, many applications provide only a partial mapping between the views, creating a challenge for current methods. To address this problem, we propose a multi-view algorithm based on constrained clustering that can operate with an incomplete mapping. Given a set of pairwise constraints in each view, our approach propagates these constraints using a local similarity measure to those instances that can be mapped to the other views, allowing the propagated constraints to be transferred across views via the partial mapping. It uses co-EM to iteratively estimate the propagation within each view based on the current clustering model, transfer the constraints across views, and then update the clustering model. By alternating the learning process between views, this approach produces a unified clustering model that is consistent with all views. We show that this approach significantly improves clustering performance over several other methods for transferring constraints and allows multi-view clustering to be reliably applied when given a limited mapping between the views. Our evaluation reveals that the propagated constraints have high precision with respect to the true clusters in the data, explaining their benefit to clustering performance in both single- and multi-view learning scenarios.
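
The transfer step described above can be sketched roughly as follows. The sketch assumes one-dimensional instance positions, absolute distance as the local similarity measure, and a fixed radius. None of these come from the abstract, and the co-EM loop that re-estimates the propagation from the current clustering model is omitted. All names and values are hypothetical.

```python
def nearest_mapped(instance, positions, mapping, radius):
    """Return the instance itself if it is mapped, otherwise the closest
    mapped instance within `radius` (assumed similarity: 1-D distance)."""
    if instance in mapping:
        return instance
    candidates = [
        (abs(positions[instance] - positions[m]), m)
        for m in mapping
        if abs(positions[instance] - positions[m]) <= radius
    ]
    return min(candidates)[1] if candidates else None


def transfer_constraints(constraints, positions, mapping, radius):
    """Propagate each pairwise constraint to nearby mapped instances, then
    carry it into the other view through the partial mapping."""
    transferred = []
    for a, b, kind in constraints:
        a_mapped = nearest_mapped(a, positions, mapping, radius)
        b_mapped = nearest_mapped(b, positions, mapping, radius)
        if a_mapped is None or b_mapped is None or a_mapped == b_mapped:
            continue  # constraint cannot be expressed in the other view
        transferred.append((mapping[a_mapped], mapping[b_mapped], kind))
    return transferred


# Toy view (hypothetical): only instances 1 and 2 have counterparts.
positions = {0: 0.0, 1: 0.1, 2: 5.0, 3: 5.2}
mapping = {1: "x", 2: "y"}
constraints = [(0, 3, "cannot-link")]
print(transfer_constraints(constraints, positions, mapping, radius=0.5))
```

Here the constraint between the unmapped instances 0 and 3 is moved onto their mapped neighbours 1 and 2 and arrives in the other view as a cannot-link between "x" and "y". With a smaller radius no neighbour qualifies and nothing is transferred.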