Decoupled Contrastive Multi-view Clustering with High-order Random Walks
Recently, some robust contrastive multi-view clustering (MvC) methods have
been proposed, which construct data pairs from neighborhoods to alleviate the
false negative issue, i.e., some intra-cluster samples are wrongly treated as
negative pairs. Although promising performance has been achieved by these
methods, the false negative issue is still far from being addressed, and the false
positive issue emerges because all in- and out-of-neighborhood samples are
simply treated as positive and negative, respectively. To address these issues,
we propose a novel robust method, dubbed decoupled contrastive multi-view
clustering with high-order random walks (DIVIDE). In brief, DIVIDE leverages
random walks to progressively identify data pairs in a global instead of local
manner. As a result, DIVIDE can identify in-neighborhood negatives and
out-of-neighborhood positives. Moreover, DIVIDE adopts a novel MvC
architecture to perform inter- and intra-view contrastive learning in different
embedding spaces, thus boosting clustering performance and improving
robustness against missing views. To verify the efficacy of DIVIDE, we carry
out extensive experiments on four benchmark datasets, comparing it with nine
state-of-the-art MvC methods in both complete and incomplete MvC settings.
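The idea of using multi-step random walks to look beyond the immediate neighborhood can be illustrated with a minimal sketch. This is not the DIVIDE implementation: the cosine-similarity graph, the `temperature` parameter, and the function name `random_walk_affinity` are assumptions made for illustration only.

```python
import numpy as np

def random_walk_affinity(features, steps=3, temperature=0.5):
    """Return the `steps`-step random-walk transition matrix over a similarity graph.

    features: array of shape (n_samples, n_dims).
    temperature: hypothetical sharpening parameter (an assumption, not from the paper).
    """
    # L2-normalize rows so that dot products are cosine similarities.
    z = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = z @ z.T
    # Turn similarities into positive edge weights and drop self-loops.
    affinity = np.exp(sim / temperature)
    np.fill_diagonal(affinity, 0.0)
    # Row-normalize to obtain a one-step transition matrix.
    transition = affinity / affinity.sum(axis=1, keepdims=True)
    # Entry (i, j) of the matrix power is the probability of reaching j
    # from i in exactly `steps` steps, so it can be non-negligible even
    # when j is not a direct neighbor of i.
    return np.linalg.matrix_power(transition, steps)
```

Under these assumptions, pairs with a high multi-step probability could be treated as candidate positives and pairs with a low one as candidate negatives, regardless of whether they are direct neighbors.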
Efficient and Effective Deep Multi-view Subspace Clustering
Recent multi-view subspace clustering methods achieve impressive results using
deep networks, where the self-expressive correlation is typically modeled by a
fully connected (FC) layer. However, they still suffer from two limitations. i)
The parameter scale of the FC layer is quadratic in the number of samples, resulting
in high time and memory costs that significantly degrade their feasibility on
large-scale datasets. ii) Extracting a unified
representation that simultaneously satisfies minimal sufficiency and
discriminability remains under-explored. To this end, we propose a novel deep framework, termed
Efficient and Effective deep Multi-View Subspace Clustering (EMVSC).
Instead of a parameterized FC layer, we design a Relation-Metric Net that
decouples network parameter scale from the number of samples for greater computational
efficiency. Most importantly, the proposed method devises a multi-type
auto-encoder to explicitly decouple consistent, complementary, and superfluous
information from every view, which is supervised by a soft clustering
assignment similarity constraint. Following information bottleneck theory and
the maximal coding rate reduction principle, a sufficient yet minimal unified
representation can be obtained that also exhibits intra-cluster aggregation
and inter-cluster separability. Extensive experiments show that
EMVSC yields results comparable to existing methods and achieves
state-of-the-art performance on various types of multi-view datasets.
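The coding rate reduction quantity referred to above can be sketched as follows. This is the generic rate-reduction measure for hard cluster labels, not the EMVSC loss; the function names and the distortion parameter `eps` are assumptions for illustration.

```python
import numpy as np

def coding_rate(z, eps=0.5):
    """Coding rate 0.5 * logdet(I + d / (n * eps^2) * Z^T Z) for z of shape (n, d)."""
    n, d = z.shape
    gram = z.T @ z  # (d, d) second-moment matrix of the samples
    _, logdet = np.linalg.slogdet(np.eye(d) + (d / (n * eps ** 2)) * gram)
    return 0.5 * logdet

def rate_reduction(z, labels, eps=0.5):
    """Coding rate of all samples minus the size-weighted coding rates of each cluster."""
    n = z.shape[0]
    within = 0.0
    for c in np.unique(labels):
        zc = z[labels == c]
        within += (len(zc) / n) * coding_rate(zc, eps)
    return coding_rate(z, eps) - within
```

A larger value indicates that the representation as a whole spans a large volume while each cluster stays compact, which matches the intra-cluster aggregation and inter-cluster separability described in the abstract.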
SCOREH+: A High-Order Node Proximity Spectral Clustering on Ratios-of-Eigenvectors Algorithm for Community Detection
The research on complex networks has achieved significant progress in
revealing the mesoscopic features of networks. Community detection is an
important aspect of understanding real-world complex systems. We present in
this paper a High-order node proximity Spectral Clustering on
Ratios-of-Eigenvectors (SCOREH+) algorithm for locating communities in complex
networks. The algorithm improves on SCORE and SCORE+ and preserves high-order
transitivity information of the network affinity matrix. We optimize the
high-order proximity matrix from the initial affinity matrix using radial
basis functions (RBFs) and the Katz index. In addition to the optimization of the
Laplacian matrix, we implement a procedure that joins an additional eigenvector
(the leading eigenvector) to the spectrum domain for clustering if
the network is considered to be a "weak signal" graph. The algorithm has been
successfully applied to both real-world and synthetic data sets. The proposed
algorithm is compared with state-of-the-art algorithms, such as ASE, Louvain,
Fast-Greedy, Spectral Clustering (SC), SCORE, and SCORE+. To demonstrate the
high efficacy of the proposed method, we conducted comparison experiments on
eleven real-world networks and a number of synthetic networks with noise. The
experimental results in most of these networks demonstrate that SCOREH+
outperforms the baseline methods. Moreover, by tuning the RBFs and their
shaping parameters, we can generate state-of-the-art community structures on
all real-world networks and even on noisy synthetic networks.
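Two ingredients named in the abstract, the Katz index and clustering on ratios of eigenvectors, can be sketched as below. This is a simplified illustration, not the SCOREH+ algorithm: it omits the RBF step, the Laplacian optimization, and the extra eigenvector for weak-signal graphs. The function names and the default choice of `beta` are assumptions.

```python
import numpy as np

def katz_proximity(adj, beta=None):
    """Katz index sum_{k>=1} beta^k A^k, computed as (I - beta*A)^{-1} - I."""
    n = adj.shape[0]
    spectral_radius = np.max(np.abs(np.linalg.eigvalsh(adj)))
    if beta is None:
        # Assumption: the series converges for any beta < 1 / spectral_radius.
        beta = 0.5 / spectral_radius
    return np.linalg.inv(np.eye(n) - beta * adj) - np.eye(n)

def score_ratios(mat, k):
    """Entrywise ratios of eigenvectors 2..k to the leading eigenvector."""
    vals, vecs = np.linalg.eigh(mat)
    # Keep the k eigenvectors with the largest eigenvalue magnitudes.
    order = np.argsort(-np.abs(vals))[:k]
    lead = vecs[:, order[0]]
    # Dividing by the leading eigenvector removes degree heterogeneity;
    # this assumes a connected graph so that no entry of `lead` is zero.
    return vecs[:, order[1:]] / lead[:, None]
```

For a graph with two communities, the sign of the single ratio column returned by `score_ratios(katz_proximity(adj), k=2)` can already separate the nodes; for more communities the ratio rows would typically be passed to a clustering routine such as k-means.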
To Compress or Not to Compress -- Self-Supervised Learning and Information Theory: A Review
Deep neural networks have demonstrated remarkable performance in supervised
learning tasks but require large amounts of labeled data. Self-supervised
learning offers an alternative paradigm, enabling the model to learn from data
without explicit labels. Information theory has been instrumental in
understanding and optimizing deep neural networks. Specifically, the
information bottleneck principle has been applied to optimize the trade-off
between compression and relevant information preservation in supervised
settings. However, the optimal information objective in self-supervised
learning remains unclear. In this paper, we review various approaches to
self-supervised learning from an information-theoretic standpoint and present a
unified framework that formalizes the self-supervised
information-theoretic learning problem. We integrate existing research into a
coherent framework, examine recent self-supervised methods, and identify
research opportunities and challenges. Moreover, we discuss empirical
measurement of information-theoretic quantities and their estimators. This
paper offers a comprehensive review of the intersection between information
theory, self-supervised learning, and deep neural networks.
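For reference, the supervised information bottleneck trade-off mentioned above is commonly written as the following Lagrangian. The notation is assumed here rather than taken from the abstract: X is the input, Y the label, Z the learned representation, I(·;·) mutual information, and β a positive trade-off weight.

```latex
\min_{p(z \mid x)} \; I(X; Z) \;-\; \beta \, I(Z; Y), \qquad \beta > 0
```

The first term rewards compression of the input, and the second rewards preservation of label-relevant information. In the self-supervised setting there is no label Y, which is why the appropriate objective is less clear.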
Agglomerative Neural Networks for Multiview Clustering
Conventional multi-view clustering methods seek a view consensus by
minimizing the pairwise discrepancy between the consensus and subviews.
However, the pairwise comparison cannot portray the inter-view relationship
precisely if some of the subviews can be further agglomerated. To address this
challenge, we propose agglomerative analysis to approximate the
optimal consensus view, thereby describing the subview relationship within a
view structure. We present the Agglomerative Neural Network (ANN), based on
Constrained Laplacian Rank, to cluster multi-view data directly while avoiding a
dedicated postprocessing step (e.g., using K-means). We further extend ANN with
a learnable data space to handle data in complex scenarios. Our evaluations
against several state-of-the-art multi-view clustering approaches on four
popular datasets show the promising view-consensus analysis ability of ANN. We
further demonstrate ANN's capability in analyzing complex view structures and
its extensibility in our case study, and explain the robustness and effectiveness of
its data-driven modifications.
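The reason a rank constraint on the graph Laplacian can remove the K-means step is that the number of zero eigenvalues of the Laplacian equals the number of connected components of the graph. The sketch below illustrates only this property; it is not the ANN model, and the function names and tolerance are assumptions.

```python
import numpy as np

def laplacian_zero_eigs(weights, tol=1e-8):
    """Count (near-)zero eigenvalues of the unnormalized Laplacian L = D - W."""
    lap = np.diag(weights.sum(axis=1)) - weights
    return int(np.sum(np.linalg.eigvalsh(lap) < tol))

def component_labels(weights):
    """Label each node by its connected component via depth-first search."""
    n = weights.shape[0]
    labels = -np.ones(n, dtype=int)
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            node = stack.pop()
            for nb in np.flatnonzero(weights[node] > 0):
                if labels[nb] == -1:
                    labels[nb] = current
                    stack.append(nb)
        current += 1
    return labels
```

If a learned similarity graph has exactly k connected components, `component_labels` yields the k clusters directly, with no separate clustering pass on an embedding.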