Reliable Conflictive Multi-View Learning
Multi-view learning aims to combine multiple features to achieve more
comprehensive descriptions of data. Most previous works assume that multiple
views are strictly aligned. However, real-world multi-view data may contain
low-quality conflictive instances, which show conflictive information in
different views. Previous methods for this problem mainly focus on eliminating
the conflictive data instances by removing them or replacing conflictive views.
Nevertheless, real-world applications usually require making decisions for
conflictive instances rather than only eliminating them. To solve this, we
point out a new Reliable Conflictive Multi-view Learning (RCML) problem, which
requires the model to provide decision results and attached reliabilities for
conflictive multi-view data. We develop an Evidential Conflictive Multi-view
Learning (ECML) method for this problem. ECML first learns view-specific
evidence, which could be termed as the amount of support to each category
collected from data. Then, we can construct view-specific opinions consisting
of decision results and reliability. In the multi-view fusion stage, we propose
a conflictive opinion aggregation strategy and theoretically prove this
strategy can exactly model the relation of multi-view common and view-specific
reliabilities. Experiments performed on 6 datasets verify the effectiveness of
ECML.
Comment: 9 pages; to appear in AAAI 2024.
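The abstract does not spell out the evidential formulation, but the standard parameterization in evidential deep learning maps non-negative per-class evidence to a Dirichlet-based opinion with belief masses and an uncertainty mass. The sketch below illustrates that mapping, together with a deliberately simple averaging fusion that only stands in for ECML's conflictive opinion aggregation; the function names and the fusion rule are illustrative assumptions, not the authors' method.

```python
import numpy as np

def evidence_to_opinion(evidence):
    """Map non-negative per-class evidence to a subjective opinion
    (belief masses b_k and uncertainty u) via a Dirichlet with
    alpha_k = evidence_k + 1, as in standard evidential deep learning."""
    evidence = np.asarray(evidence, dtype=float)
    K = evidence.size
    alpha = evidence + 1.0
    S = alpha.sum()
    belief = evidence / S          # b_k = e_k / S
    uncertainty = K / S            # u = K / S, shrinks as evidence grows
    return belief, uncertainty

def average_opinion_fusion(opinions):
    """Illustrative stand-in for ECML's conflictive opinion aggregation:
    simply average belief masses and uncertainties across views.
    The paper's actual strategy is more involved; this only shows the
    interface (per-view opinions in, one fused opinion out)."""
    beliefs = np.mean([b for b, _ in opinions], axis=0)
    u = float(np.mean([u for _, u in opinions]))
    return beliefs, u

# Two views giving conflicting evidence over 3 classes.
view1 = evidence_to_opinion([9.0, 1.0, 0.0])   # confident in class 0
view2 = evidence_to_opinion([0.0, 1.0, 8.0])   # confident in class 2
fused_belief, fused_u = average_opinion_fusion([view1, view2])
print(fused_belief, fused_u)  # conflict shows up as spread-out belief
```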
Decoupled Contrastive Multi-view Clustering with High-order Random Walks
Recently, some robust contrastive multi-view clustering (MvC) methods have
been proposed, which construct data pairs from neighborhoods to alleviate the
false negative issue, i.e., some intra-cluster samples are wrongly treated as
negative pairs. Although promising performance has been achieved by these
methods, the false negative issue is still far from being addressed, and the
false positive issue emerges because all in-neighborhood and
out-of-neighborhood samples are simply treated as positives and negatives,
respectively. To address these issues,
we propose a novel robust method, dubbed decoupled contrastive multi-view
clustering with high-order random walks (DIVIDE). In brief, DIVIDE leverages
random walks to progressively identify data pairs in a global instead of local
manner. As a result, DIVIDE could identify in-neighborhood negatives and
out-of-neighborhood positives. Moreover, DIVIDE embraces a novel MvC
architecture to perform inter- and intra-view contrastive learning in different
embedding spaces, thus boosting clustering performance and embracing the
robustness against missing views. To verify the efficacy of DIVIDE, we carry
out extensive experiments on four benchmark datasets, comparing against nine
state-of-the-art MvC methods in both complete and incomplete MvC settings.
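As a rough illustration of how high-order random walks can re-score pairs globally rather than from raw neighborhoods alone, the sketch below propagates a kNN transition matrix over several steps and uses the accumulated visiting probabilities as soft pair affinities. The graph construction, walk length, and the idea of thresholding the affinities are assumptions made for illustration and do not reproduce DIVIDE's actual procedure.

```python
import numpy as np

def knn_transition_matrix(features, k=5):
    """Build a row-stochastic transition matrix: each sample walks to
    one of its k nearest neighbours with equal probability."""
    n = features.shape[0]
    d = np.linalg.norm(features[:, None] - features[None, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    P = np.zeros((n, n))
    for i in range(n):
        nbrs = np.argsort(d[i])[:k]
        P[i, nbrs] = 1.0 / k
    return P

def high_order_affinity(P, steps=3):
    """Average the t-step visiting probabilities for t = 1..steps.
    High affinity for an out-of-neighborhood pair suggests a positive;
    low affinity inside a neighborhood suggests a false positive."""
    A = np.zeros_like(P)
    Pt = np.eye(P.shape[0])
    for _ in range(steps):
        Pt = Pt @ P
        A += Pt
    return A / steps

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (20, 8)), rng.normal(4, 1, (20, 8))])
A = high_order_affinity(knn_transition_matrix(X, k=5), steps=3)
# Pairs with large A[i, j] could then be treated as (soft) positives.
```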
Investigating and Mitigating the Side Effects of Noisy Views in Multi-view Clustering in Practical Scenarios
Multi-view clustering (MvC) aims at exploring category structures among
multi-view data without label supervision. Multiple views provide more
information than a single view, and thus existing MvC methods can achieve
satisfactory performance. However, their performance may degrade severely when
the views are noisy in practical scenarios. In this paper, we first
formally investigate the drawback of noisy views and then propose a
theoretically grounded deep MvC method (namely MvCAN) to address this issue.
Specifically, we propose a novel MvC objective that enables unshared
parameters and inconsistent clustering predictions across multiple views to
reduce the side effects of noisy views. Furthermore, a non-parametric iterative
process is designed to generate a robust learning target for mining multiple
views' useful information. Theoretical analysis reveals that MvCAN works by
achieving multi-view consistency, complementarity, and noise robustness.
Finally, extensive experiments on public datasets demonstrate that MvCAN
outperforms state-of-the-art methods and is robust to the presence of noisy
views.
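The abstract does not detail the non-parametric iterative process, but one plausible shape of such a loop is sketched below: per-view soft cluster assignments are fused into a shared learning target, and each view is re-weighted by its agreement with that target so that noisy views contribute less. The agreement-based weighting here is an illustrative assumption, not MvCAN's exact rule.

```python
import numpy as np

def robust_target(view_probs, n_iters=10):
    """Schematic non-parametric loop: fuse per-view soft cluster
    assignments (each an (n_samples, n_clusters) array) into a shared
    target, re-weighting each view by how well it agrees with the
    current target. Illustrative only, not MvCAN's exact procedure."""
    V = len(view_probs)
    weights = np.full(V, 1.0 / V)
    for _ in range(n_iters):
        target = sum(w * p for w, p in zip(weights, view_probs))
        target /= target.sum(axis=1, keepdims=True)
        # Agreement = mean probability a view assigns to the target's
        # argmax cluster; noisier views end up with lower weight.
        agree = np.array([
            p[np.arange(len(p)), target.argmax(1)].mean()
            for p in view_probs
        ])
        weights = agree / agree.sum()
    return target, weights

# Example: three views, 100 samples, 5 clusters (random soft assignments).
rng = np.random.default_rng(0)
views = [rng.dirichlet(np.ones(5), size=100) for _ in range(3)]
target, weights = robust_target(views)
```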
Improving Representation Learning for Deep Clustering and Few-shot Learning
The amounts of data in the world have increased dramatically in recent years, and it is quickly becoming infeasible for humans to label all these data. It is therefore crucial that modern machine learning systems can operate with few or no labels. The introduction of deep learning and deep neural networks has led to impressive advancements in several areas of machine learning. These advancements are largely due to the unprecedented ability of deep neural networks to learn powerful representations from a wide range of complex input signals. This ability is especially important when labeled data is limited, as the absence of a strong supervisory signal forces models to rely more on intrinsic properties of the data and its representations.
This thesis focuses on two key concepts in deep learning with few or no labels. First, we aim to improve representation quality in deep clustering - both for single-view and multi-view data. Current models for deep clustering face challenges related to properly representing semantic similarities, which is crucial for the models to discover meaningful clusterings. This is especially challenging with multi-view data, since the information required for successful clustering might be scattered across many views. Second, we focus on few-shot learning, and how geometrical properties of representations influence few-shot classification performance. We find that a large number of recent methods for few-shot learning embed representations on the hypersphere. Hence, we seek to understand what makes the hypersphere a particularly suitable embedding space for few-shot learning.
Our work on single-view deep clustering addresses the susceptibility of deep clustering models to find trivial solutions with non-meaningful representations. To address this issue, we present a new auxiliary objective that - when compared to the popular autoencoder-based approach - better aligns with the main clustering objective, resulting in improved clustering performance. Similarly, our work on multi-view clustering focuses on how representations can be learned from multi-view data, in order to make the representations suitable for the clustering objective. Where recent methods for deep multi-view clustering have focused on aligning view-specific representations, we find that this alignment procedure might actually be detrimental to representation quality. We investigate the effects of representation alignment, and provide novel insights on when alignment is beneficial, and when it is not. Based on our findings, we present several new methods for deep multi-view clustering - both alignment and non-alignment-based - that outperform current state-of-the-art methods.
Our first work on few-shot learning aims to tackle the hubness problem, which has been shown to have negative effects on few-shot classification performance. To this end, we present two new methods to embed representations on the hypersphere for few-shot learning. Further, we provide both theoretical and experimental evidence indicating that embedding representations as uniformly as possible on the hypersphere reduces hubness and improves classification accuracy. Furthermore, based on our findings on hyperspherical embeddings for few-shot learning, we seek to improve the understanding of representation norms. In particular, we ask what type of information the norm carries, and why it is often beneficial to discard the norm in classification models. We answer this question by presenting a novel hypothesis on the relationship between representation norm and the number of a certain class of objects in the image. We then analyze our hypothesis both theoretically and experimentally, presenting promising results that corroborate the hypothesis.
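One common way to encourage uniform hyperspherical embeddings, used here purely as an illustrative stand-in for the thesis' own methods, is to L2-normalize representations and monitor a Gaussian-potential uniformity loss in the style of Wang and Isola; lower values indicate points spread more evenly over the sphere, the property linked above to reduced hubness.

```python
import numpy as np

def l2_normalize(z, eps=1e-8):
    """Project embeddings onto the unit hypersphere."""
    return z / (np.linalg.norm(z, axis=1, keepdims=True) + eps)

def uniformity_loss(z, t=2.0):
    """Log of the mean Gaussian potential over all pairs of normalized
    embeddings; smaller values mean the points are spread more
    uniformly over the sphere."""
    sq_dists = np.sum((z[:, None] - z[None, :]) ** 2, axis=-1)
    iu = np.triu_indices(len(z), k=1)
    return np.log(np.mean(np.exp(-t * sq_dists[iu])))

rng = np.random.default_rng(0)
emb = l2_normalize(rng.normal(size=(64, 16)))
print(uniformity_loss(emb))
```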