Rethinking the Learning Paradigm for Facial Expression Recognition
Due to the subjective crowdsourcing annotations and the inherent inter-class
similarity of facial expressions, the real-world Facial Expression Recognition
(FER) datasets usually exhibit ambiguous annotation. To simplify the learning
paradigm, most previous methods convert ambiguous annotation results into
precise one-hot annotations and train FER models in an end-to-end supervised
manner. In this paper, we rethink the existing training paradigm and propose
that it is better to use weakly supervised strategies to train FER models with
original ambiguous annotation
A Survey of Location Prediction on Twitter
Locations, e.g., countries, states, cities, and points of interest, are
central to news, emergency events, and people's daily lives. Automatic
identification of locations associated with or mentioned in documents has been
explored for decades. As one of the most popular online social network
platforms, Twitter has attracted a large number of users who send millions of
tweets on a daily basis. Due to the world-wide coverage of its users and
real-time freshness of tweets, location prediction on Twitter has gained
significant attention in recent years. Research efforts are spent on dealing
with new challenges and opportunities brought by the noisy, short, and
context-rich nature of tweets. In this survey, we aim at offering an overall
picture of location prediction on Twitter. Specifically, we concentrate on the
prediction of user home locations, tweet locations, and mentioned locations. We
first define the three tasks and review the evaluation metrics. By summarizing
Twitter network, tweet content, and tweet context as potential inputs, we then
structurally highlight how the problems depend on these inputs. Each dependency
is illustrated by a comprehensive review of the corresponding strategies
adopted in state-of-the-art approaches. In addition, we also briefly review two
related problems, i.e., semantic location prediction and point-of-interest
recommendation. Finally, we list future research directions.
Comment: Accepted to TKDE. 30 pages, 1 figure
Disambiguated Attention Embedding for Multi-Instance Partial-Label Learning
In many real-world tasks, the concerned objects can be represented as a
multi-instance bag associated with a candidate label set, which consists of one
ground-truth label and several false positive labels. Multi-instance
partial-label learning (MIPL) is a learning paradigm to deal with such tasks
and has achieved favorable performance. The existing MIPL approach follows the
instance-space paradigm by assigning augmented candidate label sets of bags to
each instance and aggregating bag-level labels from instance-level labels.
However, this scheme may be suboptimal as global bag-level information is
ignored and the predicted labels of bags are sensitive to predictions of
negative instances. In this paper, we study an alternative scheme where a
multi-instance bag is embedded into a single vector representation.
Accordingly, an intuitive algorithm named DEMIPL, i.e., Disambiguated attention
Embedding for Multi-Instance Partial-Label learning, is proposed. DEMIPL
employs a disambiguation attention mechanism to aggregate a multi-instance bag
into a single vector representation, followed by a momentum-based
disambiguation strategy to identify the ground-truth label from the candidate
label set. Furthermore, we introduce a real-world MIPL dataset for colorectal
cancer classification. Experimental results on benchmark and real-world
datasets validate the superiority of DEMIPL against the compared MIPL and
partial-label learning approaches.
Comment: Accepted at NeurIPS 202
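
As a rough illustration of the embedding idea described in the abstract above (aggregating a multi-instance bag into a single vector with attention), here is a minimal sketch in plain Python. It shows generic softmax attention pooling only; it is not DEMIPL's disambiguation attention mechanism, and the function name `attention_pool` and the scoring vector `w` are hypothetical names introduced for this example.

```python
import math


def attention_pool(bag, w):
    """Aggregate a bag of instance vectors into one embedding vector.

    bag: list of instance feature vectors (lists of floats, same length).
    w:   hypothetical scoring vector; instance score is dot(w, x).
    Returns (embedding, attention_weights).
    """
    # One scalar relevance score per instance.
    scores = [sum(wj * xj for wj, xj in zip(w, x)) for x in bag]
    # Numerically stable softmax over the instances in the bag.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Bag embedding is the attention-weighted sum of instance vectors.
    dim = len(bag[0])
    embedding = [sum(a * x[d] for a, x in zip(weights, bag)) for d in range(dim)]
    return embedding, weights


if __name__ == "__main__":
    bag = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    emb, att = attention_pool(bag, w=[2.0, 0.0])
    print(emb, att)
```

In a trained model the scores would come from learned parameters rather than a fixed `w`; the point here is only that the whole bag is reduced to one vector before any label is predicted, in contrast to the instance-space scheme.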
Robust Representation Learning for Unreliable Partial Label Learning
Partial Label Learning (PLL) is a type of weakly supervised learning where
each training instance is assigned a set of candidate labels, but only one
label is the ground-truth. However, this idealistic assumption may not always
hold due to potential annotation inaccuracies, meaning the ground-truth may not
be present in the candidate label set. This setting is known as Unreliable Partial
Label Learning (UPLL), which introduces additional complexity due to the
inherent unreliability and ambiguity of partial labels, often resulting in
sub-optimal performance with existing methods. To address this challenge, we
propose the Unreliability-Robust Representation Learning framework (URRL) that
leverages unreliability-robust contrastive learning to help the model fortify
against unreliable partial labels effectively. Concurrently, we propose a dual
strategy that combines KNN-based candidate label set correction and
consistency-regularization-based label disambiguation to refine label quality
and enhance the ability of representation learning within the URRL framework.
Extensive experiments demonstrate that the proposed method outperforms
state-of-the-art PLL methods on various datasets with diverse degrees of
unreliability and ambiguity. Furthermore, we provide a theoretical analysis of
our approach from the perspective of the expectation maximization (EM)
algorithm. Upon acceptance, we pledge to make the code publicly accessible
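
To make the partial-label setting in the abstract above concrete, the following sketch shows one common and simple disambiguation step: restricting a model's class probabilities to the candidate label set and renormalizing. This is an assumed, generic illustration, not the URRL method; the function name `restrict_to_candidates` and the uniform fallback are choices made for this example.

```python
def restrict_to_candidates(probs, candidates):
    """Renormalize class probabilities over a candidate label set.

    probs:      list of predicted class probabilities, indexed by class id.
    candidates: set of class ids forming the candidate label set.
    Returns a new distribution that is zero outside the candidate set.
    """
    masked = [p if k in candidates else 0.0 for k, p in enumerate(probs)]
    total = sum(masked)
    if total == 0.0:
        # Assumed fallback: spread mass uniformly over the candidates.
        share = 1.0 / len(candidates)
        return [share if k in candidates else 0.0 for k in range(len(probs))]
    return [m / total for m in masked]


if __name__ == "__main__":
    # Class 0 is the model's top guess but is not a candidate label.
    print(restrict_to_candidates([0.5, 0.3, 0.2], {1, 2}))
```

Note that this step assumes the ground-truth label is inside the candidate set. In the unreliable setting described above that assumption can fail, which is why a candidate-set correction step (such as the KNN-based one the abstract mentions) would be needed before restricting.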