Semi-supervised evidential label propagation algorithm for graph data
In the task of community detection, useful prior information is often available. In this paper, a Semi-supervised clustering approach using a new Evidential Label Propagation strategy (SELP) is proposed to incorporate domain knowledge into the community detection model. The main advantage of SELP is that it can use limited supervised knowledge to guide the detection process. The prior information on community labels is initially expressed in the form of mass functions. A new evidential label propagation rule is then adopted to propagate the labels from labeled data to unlabeled data. Outliers can be identified and assigned to a special class. The experimental results demonstrate the effectiveness of SELP.
Few-shot Image Classification based on Gradual Machine Learning
Few-shot image classification aims to accurately classify unlabeled images
using only a few labeled samples. The state-of-the-art solutions are built by
deep learning, which focuses on designing increasingly complex deep backbones.
Unfortunately, the task remains very challenging due to the difficulty of
transferring the knowledge learned in training classes to new ones. In this
paper, we propose a novel approach based on the non-i.i.d. paradigm of gradual
machine learning (GML). It begins with only a few labeled observations, and
then gradually labels target images in the increasing order of hardness by
iterative factor inference in a factor graph. Specifically, our proposed
solution extracts indicative feature representations by deep backbones, and
then constructs both unary and binary factors based on the extracted features
to facilitate gradual learning. The unary factors are constructed based on
class center distance in an embedding space, while the binary factors are
constructed based on k-nearest neighborhood. We have empirically validated the
performance of the proposed approach on benchmark datasets by a comparative
study. Our extensive experiments demonstrate that the proposed approach can
improve the SOTA performance by 1-5% in terms of accuracy. More notably, it is
more robust than the existing deep models in that its performance can
consistently improve as the size of query set increases while the performance
of deep models remains essentially flat or even becomes worse. Comment: 17 pages, 6 figures, 5 tables, 55 conference
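The feature-construction step described in this abstract (unary factors from class-center distance, binary factors from k-nearest neighbors) can be illustrated with a minimal sketch. It covers only how the two kinds of evidence might be derived from embeddings, not the factor-graph inference itself. The function names, the Euclidean metric, and the toy vectors are assumptions made for illustration and are not taken from the paper.

```python
import math

def class_centers(embeddings, labels):
    """Mean embedding per class, computed from the few labeled samples."""
    sums, counts = {}, {}
    for vec, cls in zip(embeddings, labels):
        acc = sums.setdefault(cls, [0.0] * len(vec))
        for d, x in enumerate(vec):
            acc[d] += x
        counts[cls] = counts.get(cls, 0) + 1
    return {cls: [x / counts[cls] for x in acc] for cls, acc in sums.items()}

def center_distances(query, centers):
    """Unary evidence (assumed form): distance from a query to each class center."""
    return {cls: math.dist(query, center) for cls, center in centers.items()}

def knn_pairs(queries, k):
    """Binary evidence (assumed form): undirected pairs linking each query
    to its k nearest neighbors among the other queries."""
    pairs = set()
    for i, q in enumerate(queries):
        others = [j for j in range(len(queries)) if j != i]
        others.sort(key=lambda j: math.dist(q, queries[j]))
        for j in others[:k]:
            pairs.add((min(i, j), max(i, j)))
    return pairs

# Toy 2-D embeddings standing in for deep-backbone features (hypothetical data).
support = [[0.0, 0.0], [0.0, 2.0], [10.0, 10.0], [10.0, 12.0]]
support_labels = [0, 0, 1, 1]
queries = [[1.0, 1.0], [2.0, 1.0], [9.0, 11.0]]

centers = class_centers(support, support_labels)
unary = [center_distances(q, centers) for q in queries]
binary = knn_pairs(queries, k=1)
```

In this sketch a query close to one class center and far from the others would count as an "easy" instance, while the neighbor pairs are the links along which labels could spread to harder instances.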
Uncertainty-guided Boundary Learning for Imbalanced Social Event Detection
Real-world social events typically exhibit a severe class-imbalance
distribution, which makes the trained detection model encounter a serious
generalization challenge. Most studies solve this problem from the frequency
perspective and emphasize the representation or classifier learning for tail
classes. In our observation, however, the calibrated uncertainty estimated by
well-trained evidential deep learning networks reflects model performance
better than class rarity does. To this end, we propose a novel
uncertainty-guided class imbalance learning framework - UCL, and its
variant - UCL-EC, for imbalanced social event detection tasks. We aim
to improve the overall model performance by enhancing model generalization to
those uncertain classes. Considering performance degradation usually comes from
misclassifying samples as their confusing neighboring classes, we focus on
boundary learning in latent space and classifier learning with high-quality
uncertainty estimation. First, we design a novel uncertainty-guided contrastive
learning loss, UCL, and its variant, UCL-EC, to learn a
distinguishable representation distribution for imbalanced data. During
training, they force all classes, especially uncertain ones, to adaptively
form a clearly separable boundary in the feature space. Second, to obtain more
robust and accurate class uncertainty, we combine the results of multi-view
evidential classifiers via the Dempster-Shafer theory under the supervision of
an additional calibration method. We conduct experiments on three severely
imbalanced social event datasets including Events2012_100, Events2018_100,
and CrisisLexT_7. Our model significantly improves social event representation
and classification tasks in almost all classes, especially those uncertain
ones. Comment: Accepted by TKDE 202
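The combination of multi-view evidential classifiers via Dempster-Shafer theory mentioned in this abstract relies on Dempster's rule of combination. The sketch below shows the generic rule for two mass functions over a small frame of discernment. It is not the authors' implementation: the dictionary representation, the function name, and the example masses are assumptions chosen for illustration.

```python
def dempster_combine(m1, m2):
    """Dempster's rule: fuse two mass functions given as {frozenset: mass}."""
    combined = {}
    conflict = 0.0
    for set_a, mass_a in m1.items():
        for set_b, mass_b in m2.items():
            intersection = set_a & set_b
            if intersection:
                combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
            else:
                # Mass assigned to incompatible hypotheses counts as conflict.
                conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; the rule is undefined")
    # Renormalise so the remaining masses sum to one.
    return {s: m / (1.0 - conflict) for s, m in combined.items()}

# Two hypothetical views over the frame {A, B}; mass on {A, B} expresses ignorance.
A, B = frozenset({"A"}), frozenset({"B"})
AB = A | B
view_1 = {A: 0.6, AB: 0.4}
view_2 = {B: 0.3, AB: 0.7}
fused = dempster_combine(view_1, view_2)
```

The mass left on the whole frame after fusion can be read as the remaining uncertainty, which is the kind of quantity an uncertainty-guided loss could consume.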
A Bibliographic View on Constrained Clustering
A keyword search on constrained clustering on Web-of-Science returned just
under 3,000 documents. We ran automatic analyses of those, and compiled our own
bibliography of 183 papers which we analysed in more detail based on their
topic and experimental study, if any. This paper presents general trends of the
area and its sub-topics by Pareto analysis, using citation count and year of
publication. We list available software and analyse the experimental sections
of our reference collection. We found a notable lack of large comparison
experiments. Among the topics we reviewed, application studies were most
abundant recently, alongside deep learning, active learning and ensemble
learning. Comment: 18 pages, 11 figures, 177 references
A Graph Is More Than Its Nodes: Towards Structured Uncertainty-Aware Learning on Graphs
Current graph neural networks (GNNs) that tackle node classification on
graphs tend to only focus on nodewise scores and are solely evaluated by
nodewise metrics. This limits uncertainty estimation on graphs since nodewise
marginals do not fully characterize the joint distribution given the graph
structure. In this work, we propose novel edgewise metrics, namely the edgewise
expected calibration error (ECE) and the agree/disagree ECEs, which provide
criteria for uncertainty estimation on graphs beyond the nodewise setting. Our
experiments demonstrate that the proposed edgewise metrics can complement the
nodewise results and yield additional insights. Moreover, we show that GNN
models which consider the structured prediction problem on graphs tend to have
better uncertainty estimations, which illustrates the benefit of going beyond
the nodewise setting. Comment: Presented at NeurIPS 2022 New Frontiers in Graph Learning Workshop
(NeurIPS GLFrontiers 2022)
Label Propagation with Weak Supervision
Semi-supervised learning and weakly supervised learning are important
paradigms that aim to reduce the growing demand for labeled data in current
machine learning applications. In this paper, we introduce a novel analysis of
the classical label propagation algorithm (LPA) (Zhu & Ghahramani, 2002) that
additionally takes advantage of useful prior information, specifically
probabilistic hypothesized labels on the unlabeled data. We provide an error
bound that exploits both the local geometric properties of the underlying graph
and the quality of the prior information. We also propose a framework to
incorporate multiple sources of noisy information. In particular, we consider
the setting of weak supervision, where our sources of information are weak
labelers. We demonstrate the ability of our approach on multiple benchmark
weakly supervised classification tasks, showing improvements upon existing
semi-supervised and weakly supervised methods. Comment: 26 pages, 2 figures
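For orientation, the classical label propagation algorithm that this abstract analyses can be sketched in a few lines: labeled nodes are clamped to their known class, and every unlabeled node repeatedly takes the weighted average of its neighbors' class scores. This is a minimal sketch of the classical procedure only. How the paper folds probabilistic prior labels into it is not described in the abstract, so that part is left out, and the function name, the uniform initialisation, and the toy graph are assumptions.

```python
def label_propagation(weights, labels, num_classes, iters=200):
    """Classical clamped label propagation on a weighted graph.

    weights: symmetric adjacency matrix as a list of lists.
    labels:  dict mapping labeled node index -> class index.
    Returns one class-score vector per node.
    """
    n = len(weights)
    # Unlabeled nodes start uniform; labeled nodes start (and stay) one-hot.
    scores = [[1.0 / num_classes] * num_classes for _ in range(n)]
    for node, cls in labels.items():
        scores[node] = [1.0 if c == cls else 0.0 for c in range(num_classes)]
    for _ in range(iters):
        updated = []
        for i in range(n):
            degree = sum(weights[i])
            if i in labels or degree == 0:
                updated.append(scores[i])  # clamp labeled or isolated nodes
                continue
            updated.append([
                sum(weights[i][j] * scores[j][c] for j in range(n)) / degree
                for c in range(num_classes)
            ])
        scores = updated
    return scores

# Path graph 0 - 1 - 2 - 3 with the two end nodes labeled (hypothetical example).
weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
scores = label_propagation(weights, labels={0: 0, 3: 1}, num_classes=2)
predictions = [max(range(2), key=lambda c: s[c]) for s in scores]
```

On this path graph the scores of the two inner nodes converge to the harmonic solution, so each inner node ends up favouring the class of its nearer labeled endpoint.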
Zero-Shot Rumor Detection with Propagation Structure via Prompt Learning
The spread of rumors along with breaking events seriously hinders the truth
in the era of social media. Previous studies reveal that due to the lack of
annotated resources, rumors presented in minority languages are hard to
detect. Furthermore, the unforeseen breaking events not involved in
yesterday's news exacerbate the scarcity of data resources. In this work, we
propose a novel zero-shot framework based on prompt learning to detect rumors
falling in different domains or presented in different languages. More
specifically, we first represent rumors circulated on social media as diverse
propagation threads, then design a hierarchical prompt encoding mechanism to
learn language-agnostic contextual representations for both prompts and rumor
data. To further enhance domain adaptation, we model the domain-invariant
structural features from the propagation threads, to incorporate structural
position representations of influential community response. In addition, a new
virtual response augmentation method is used to improve model training.
Extensive experiments conducted on three real-world datasets demonstrate that
our proposed model achieves much better performance than state-of-the-art
methods and exhibits a superior capacity for detecting rumors at early stages. Comment: AAAI 202