Non-linear Attributed Graph Clustering by Symmetric NMF with PU Learning
We consider the problem of clustering attributed graphs. The challenge is to
design an effective and efficient clustering method that precisely
captures the hidden relationship between the topology and the attributes in
real-world graphs. We propose Non-linear Attributed Graph Clustering by
Symmetric Non-negative Matrix Factorization with Positive Unlabeled Learning.
Our method has three features: 1) it learns a non-linear
projection function between the different cluster assignments of the topology
and the attributes of graphs so as to capture the complicated relationship
between the topology and the attributes in real-world graphs, 2) it leverages
positive-unlabeled learning to incorporate the effect of partially observed
positive edges into the cluster assignment, and 3) it achieves efficient
computational complexity, expressed in terms of the vertex size, the
attribute size, the number of clusters, and the number of iterations for
learning the cluster assignment. Extensive experiments comparing various
clustering methods on various real datasets validate that our method
outperforms existing clustering methods in clustering quality.
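To make the symmetric NMF building block named in the title concrete, here is a minimal sketch of plain symmetric NMF on a toy similarity matrix. It is not the authors' method: it omits the non-linear projection, the attribute factorization, and the positive-unlabeled weighting. The function name `symmetric_nmf`, the damped multiplicative update with `beta=0.5`, and the toy matrix are all assumptions chosen for illustration.

```python
import numpy as np

def symmetric_nmf(A, k, n_iter=300, beta=0.5, seed=0, eps=1e-10):
    """Approximate a symmetric non-negative matrix A by H @ H.T with H >= 0.

    Uses a damped multiplicative update (an assumed, standard choice);
    returns the factor H and the squared Frobenius error per iteration.
    """
    rng = np.random.default_rng(seed)
    H = rng.random((A.shape[0], k))          # non-negative random start
    losses = []
    for _ in range(n_iter):
        losses.append(np.linalg.norm(A - H @ H.T) ** 2)
        numer = A @ H
        denom = H @ (H.T @ H) + eps          # eps avoids division by zero
        H *= (1.0 - beta) + beta * numer / denom
    losses.append(np.linalg.norm(A - H @ H.T) ** 2)
    return H, losses

# Hypothetical toy input: two groups of three vertices, fully connected
# within each group (diagonal set to 1), no edges between groups.
A = np.zeros((6, 6))
A[:3, :3] = 1.0
A[3:, 3:] = 1.0

H, losses = symmetric_nmf(A, k=2)
labels = H.argmax(axis=1)                    # cluster = largest factor entry
print(labels, losses[0], losses[-1])
```

Each row of `H` acts as a soft cluster assignment for one vertex, which is why taking the row-wise argmax yields a hard clustering.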
Asymmetric double-winged multi-view clustering network for exploring Diverse and Consistent Information
In unsupervised scenarios, deep contrastive multi-view clustering (DCMVC) is
becoming an active research topic that aims to mine the potential relationships
between different views. Most existing DCMVC algorithms focus on exploring the
consistency information for the deep semantic features, while ignoring the
diverse information on shallow features. To fill this gap, we propose a novel
multi-view clustering network termed CodingNet to explore the diverse and
consistent information simultaneously in this paper. Specifically, instead of
utilizing the conventional auto-encoder, we design an asymmetric structure
network to extract shallow and deep features separately. Then, by aligning the
similarity matrix of the shallow features to the zero matrix, we ensure the
diversity of the shallow features, thus offering a better description of
multi-view data. Moreover, we propose a dual contrastive mechanism that
maintains consistency for deep features at both view-feature and pseudo-label
levels. Extensive experiments on six widely used benchmark datasets validate
our framework's efficacy: it outperforms most state-of-the-art
multi-view clustering algorithms.
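The idea of aligning a shallow-feature similarity matrix to the zero matrix can be sketched as a simple penalty. This is an illustration under stated assumptions, not CodingNet's actual loss: the abstract does not say which similarity is used, so the sketch assumes a cross-view cosine similarity between row-normalized shallow features and a mean squared penalty. The name `diversity_loss` and the toy inputs are hypothetical.

```python
import numpy as np

def diversity_loss(z1, z2, eps=1e-12):
    """Mean squared cross-view cosine similarity (assumed form).

    z1, z2: (n_samples, dim) shallow features from two views.
    The loss is 0 exactly when every cross-view pair is orthogonal,
    i.e. when the similarity matrix equals the zero matrix.
    """
    z1n = z1 / np.maximum(np.linalg.norm(z1, axis=1, keepdims=True), eps)
    z2n = z2 / np.maximum(np.linalg.norm(z2, axis=1, keepdims=True), eps)
    sim = z1n @ z2n.T                        # (n, n) cosine similarities
    return float(np.mean(sim ** 2))          # distance to the zero matrix

# Hypothetical toy features: orthogonal views versus identical views.
view_a = np.array([[1.0, 0.0], [2.0, 0.0]])
view_b = np.array([[0.0, 3.0], [0.0, 1.0]])

loss_orthogonal = diversity_loss(view_a, view_b)
loss_identical = diversity_loss(view_a, view_a)
print(loss_orthogonal, loss_identical)
```

Minimizing such a penalty pushes the views' shallow features apart, while a separate contrastive term on the deep features (not shown) would pull the views together.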
Integration of multi-scale protein interactions for biomedical data analysis
With the advancement of modern technologies, biomedical data about diseases are accumulating at an increasing rate. There is a need for computational methods to sift through and extract knowledge from the diverse data available in order to improve our mechanistic understanding of diseases and improve patient care. Biomedical data come in various forms, as exemplified by the various omics data. Existing studies have shown that each form of omics data gives only partial information on cell state, which has motivated jointly mining multi-omics, multi-modal data to extract integrated system knowledge. The interactome is of particular importance as it enables the modelling of dependencies arising from molecular interactions. This thesis takes a special interest in the multi-scale protein interactome and its integration with computational models to extract relevant information from biomedical data. We define multi-scale interactions at different omics scales that involve proteins: pairwise protein-protein interactions, multi-protein complexes, and biological pathways. Using hypergraph representations, we motivate considering higher-order protein interactions, highlighting the complementary biological information contained in the multi-scale interactome. Based on those results, we further investigate how those multi-scale protein interactions can be used as either prior knowledge or auxiliary data to develop machine learning algorithms. First, we design a neural network using the multi-scale organization of proteins in a cell into biological pathways as prior knowledge and train it to predict a patient's diagnosis based on transcriptomics data. From the trained models, we develop a strategy to extract biomedical knowledge pertaining to the diseases investigated. Second, we propose a general framework based on Non-negative Matrix Factorization to integrate the multi-scale protein interactome with multi-omics data.
We show that our approach outperforms the existing methods and provides biomedical insights and relevant hypotheses for specific cancer types.