317 research outputs found
Multi-Label Self-Supervised Learning with Scene Images
Self-supervised learning (SSL) methods targeting scene images have grown
rapidly in recent years, and they mostly rely on either a dedicated dense
matching mechanism or a costly unsupervised object discovery module. This paper
shows that instead of hinging on these strenuous operations, quality image
representations can be learned by treating scene/multi-label image SSL simply
as a multi-label classification problem, which greatly simplifies the learning
framework. Specifically, multiple binary pseudo-labels are assigned for each
input image by comparing its embeddings with those in two dictionaries, and the
network is optimized using the binary cross entropy loss. The proposed method
is named Multi-Label Self-supervised learning (MLS). Visualizations show
qualitatively that the pseudo-labels produced by MLS automatically find
semantically similar pseudo-positive pairs across different images to
facilitate contrastive learning. MLS learns high-quality representations on
MS-COCO and achieves state-of-the-art results on classification, detection and
segmentation benchmarks. At the same time, MLS is much simpler than existing
methods, making it easier to deploy and to explore further. Comment: ICCV202
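The pseudo-labeling idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract mentions two dictionaries and does not say how labels are chosen, so the single dictionary, the top-k cosine-similarity rule, and the names `pseudo_labels` and `bce_loss` are all assumptions made for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def pseudo_labels(embedding, dictionary, k):
    """Assumed rule: the k most similar dictionary entries get label 1, the rest 0."""
    sims = [cosine(embedding, entry) for entry in dictionary]
    top = set(sorted(range(len(sims)), key=lambda i: sims[i], reverse=True)[:k])
    return [1 if i in top else 0 for i in range(len(sims))]

def bce_loss(logits, labels):
    """Mean binary cross entropy over independent sigmoid outputs (numerically stable form)."""
    total = 0.0
    for z, y in zip(logits, labels):
        total += max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))
    return total / len(logits)

# Toy usage: one image embedding compared against a three-entry dictionary.
dictionary = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
labels = pseudo_labels([1.0, 0.1], dictionary, k=2)
loss = bce_loss([2.0, -1.0, 0.5], labels)
```

Each dictionary entry is treated as an independent binary target, which is why the loss is binary cross entropy rather than a softmax over entries.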
Instance-based Max-margin for Practical Few-shot Recognition
In order to mimic the human few-shot learning (FSL) ability better and to
make FSL closer to real-world applications, this paper proposes a practical FSL
(pFSL) setting. pFSL is based on unsupervised pretrained models (analogous to
human prior knowledge) and recognizes many novel classes simultaneously.
Compared to traditional FSL, pFSL is simpler in its formulation, easier to
evaluate, more challenging and more practical. To cope with the rarity of
training examples, this paper proposes IbM2, an instance-based max-margin
method that not only suits the new pFSL setting but also works well in
traditional FSL scenarios. Based on the Gaussian Annulus Theorem, IbM2 converts random
noise applied to the instances into a mechanism to achieve maximum margin in
the many-way pFSL (or traditional FSL) recognition task. Experiments with
various self-supervised pretraining methods and diverse many- or few-way FSL
tasks show that IbM2 almost always leads to improvements compared to its
respective baseline methods, and in most cases the improvements are
significant. With both the new pFSL setting and novel IbM2 method, this paper
shows that practical few-shot learning is both viable and promising.
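The Gaussian Annulus Theorem cited in this abstract says that samples from a high-dimensional standard Gaussian concentrate in a thin shell of radius about sqrt(d). The sketch below only checks that concentration empirically. It does not reproduce IbM2, and the function name `gaussian_norms`, the dimension and the sample count are illustrative assumptions.

```python
import math
import random

def gaussian_norms(dim, n_samples, seed=0):
    """Euclidean norms of n_samples draws from a dim-dimensional standard Gaussian."""
    rng = random.Random(seed)
    norms = []
    for _ in range(n_samples):
        x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norms.append(math.sqrt(sum(v * v for v in x)))
    return norms

# With dim = 1000, every norm should lie close to sqrt(1000), about 31.6.
dim = 1000
norms = gaussian_norms(dim, n_samples=50)
ratios = [n / math.sqrt(dim) for n in norms]
print(min(ratios), max(ratios))
```

Because the noise norm is nearly constant in high dimensions, a noisy copy of an instance lies at a roughly fixed distance from it, which is the property a margin-based method can exploit.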
Worst Case Matters for Few-Shot Recognition
Few-shot recognition learns a recognition model with very few (e.g., 1 or 5)
images per category, and current few-shot learning methods focus on improving
the average accuracy over many episodes. We argue that in real-world
applications we may often only try one episode instead of many, and hence
maximizing the worst-case accuracy is more important than maximizing the
average accuracy. We empirically show that a high average accuracy does not
necessarily mean a high worst-case accuracy. Since this objective is not
accessible, we propose to reduce the standard deviation and increase the
average accuracy simultaneously. In turn, we devise two strategies from the
bias-variance tradeoff perspective to implicitly reach this goal: a simple yet
effective stability regularization (SR) loss together with model ensemble to
reduce variance during fine-tuning, and an adaptability calibration mechanism
to reduce the bias. Extensive experiments on benchmark datasets demonstrate the
effectiveness of the proposed strategies, which outperform current
state-of-the-art methods by a significant margin in terms of not only
average, but also worst-case accuracy. Our code is available at
https://github.com/heekhero/ACSR. Comment: Accepted by ECCV202
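The distinction between average and worst-case accuracy can be made concrete with a small sketch. The per-episode accuracies below are invented purely for illustration and are not results from the paper. The helper name `summarize` is likewise an assumption.

```python
import statistics

def summarize(episode_accuracies):
    """Mean, population standard deviation and worst case over a list of episode accuracies."""
    return {
        "mean": statistics.mean(episode_accuracies),
        "std": statistics.pstdev(episode_accuracies),
        "worst": min(episode_accuracies),
    }

# Hypothetical methods: B has the higher average but a much worse single episode.
method_a = [0.70, 0.70, 0.70, 0.70]
method_b = [0.95, 0.90, 0.85, 0.30]

summary_a = summarize(method_a)
summary_b = summarize(method_b)
print(summary_a)
print(summary_b)
```

In this toy case the method with the higher mean also has the larger standard deviation and the lower worst case, which matches the abstract's motivation for reducing the standard deviation while raising the average.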
Low-rank Attention Side-Tuning for Parameter-Efficient Fine-Tuning
In finetuning a large pretrained model to downstream tasks,
parameter-efficient fine-tuning (PEFT) methods can effectively finetune
pretrained models with few trainable parameters, but suffer from high GPU
memory consumption and slow training speed. Because learnable parameters from
these methods are entangled with the pretrained model, gradients related to the
frozen pretrained model's parameters have to be computed and stored during
finetuning. We propose Low-rank Attention Side-Tuning (LAST), which
disentangles the trainable module from the pretrained model by freezing not
only parameters but also outputs of the pretrained network. LAST trains a
side-network composed of only low-rank self-attention modules. By viewing the
pretrained model as a frozen feature extractor, the side-network takes
intermediate output from the pretrained model and focuses on learning
task-specific knowledge. We also show that LAST can be highly parallel across
multiple optimization objectives, making it very efficient in downstream task
adaptation, for example, in finding optimal hyperparameters. LAST outperforms
previous state-of-the-art methods on VTAB-1K and other visual adaptation tasks,
achieving significantly higher accuracy with roughly only 30% of the GPU memory
footprint and 60% of the training time of existing PEFT methods.
Randomized Parameterized Algorithms for the Kidney Exchange Problem
In order to increase the number of potential kidney transplants between patients and their incompatible donors, kidney exchange programs have been created in many countries. In these programs, designing algorithms for the kidney exchange problem plays a critical role. The graph-theoretic model of the kidney exchange problem is to find a maximum-weight packing of vertex-disjoint cycles and chains in a given weighted digraph. In general, the length of cycles is at most a given constant L (typically 2 ≤ L ≤ 5), and the objective function corresponds to maximizing the number of possible kidney transplants. In this paper, we study the parameterized complexity of, and randomized algorithms for, the kidney exchange problem without chains from a theoretical perspective. We construct two different parameterized models of the kidney exchange problem for the two cases L = 3 and L ≥ 3, and propose two randomized parameterized algorithms based on the random partitioning technique and the randomized algebraic technique, respectively.
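The graph model in this abstract (packing vertex-disjoint cycles of bounded length in a digraph, without chains) can be sketched with an exhaustive search on a toy instance. This is a brute-force illustration only and not the paper's randomized parameterized algorithms. The function names, the unweighted objective (number of covered vertices, i.e. transplants) and the example graph are assumptions.

```python
from itertools import permutations

def short_cycles(edges, max_len):
    """All simple directed cycles with 2..max_len vertices, each written from its smallest vertex."""
    vertices = sorted({v for edge in edges for v in edge})
    cycles = set()
    for k in range(2, max_len + 1):
        for perm in permutations(vertices, k):
            if perm[0] != min(perm):
                continue  # keep one canonical rotation per cycle
            if all((perm[i], perm[(i + 1) % k]) in edges for i in range(k)):
                cycles.add(perm)
    return sorted(cycles)

def best_packing(edges, max_len):
    """Exhaustively choose vertex-disjoint cycles that cover the most vertices."""
    cycles = short_cycles(edges, max_len)
    best = []

    def search(start, used, chosen):
        nonlocal best
        if sum(map(len, chosen)) > sum(map(len, best)):
            best = list(chosen)
        for j in range(start, len(cycles)):
            if used.isdisjoint(cycles[j]):
                search(j + 1, used | set(cycles[j]), chosen + [cycles[j]])

    search(0, frozenset(), [])
    return best

# Toy compatibility digraph: an edge (u, v) means the donor of pair u can give to the patient of pair v.
edges = {(1, 2), (2, 1), (3, 4), (4, 5), (5, 3), (2, 3), (5, 1)}
packing = best_packing(edges, max_len=3)
print(packing)
```

The search is exponential in the number of cycles, so it is only suitable for tiny instances. It is meant to show what a feasible packing looks like for a given bound L.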
Measurement framework for assessing disruptive innovations
Assessing the potential disruptiveness of innovations is an important but challenging task for incumbents. However, the extant literature focuses only on technological and marketplace aspects, and most of the documented methods tend to be case specific. In this study, we present a multidimensional measurement framework to assess the disruptive potential of product innovations. The framework is designed based on the concept that the nature of disruptive innovations is multidimensional. Three aspects are considered, i.e., technological features, marketplace dynamics and external environment. Ten indicators across the three categories are proposed and then connected based on conceptual and literature analysis. Three innovations, namely, WeChat (successful), Modularised Mobile Phone (failed) and Virtual Reality/Augmented Reality (ongoing), are selected as case studies. A panel of industrial experts with PhD degrees in engineering is surveyed. The survey results are calculated and analysed according to the framework and then compared against the developments of the innovations. We also check the robustness of this framework by surveying other groups of people, and the results are nearly identical to the previous findings. This study enables a systematic assessment of the disruptive potential of innovations using the framework, providing insights for decisions on product launch and resource allocation. Peer reviewed.
Rectify the Regression Bias in Long-Tailed Object Detection
Long-tailed object detection faces great challenges because of its extremely
imbalanced class distribution. Recent methods mainly focus on the
classification bias and its loss function design, while ignoring the subtle
influence of the regression branch. This paper shows that the regression bias
exists and seriously harms detection accuracy. Existing methods fail to handle
the regression bias, and this paper hypothesizes that its main cause is the
class-specific regression head for rare classes. Accordingly, three kinds of
viable solutions that cater to the rare categories are proposed: adding a
class-agnostic branch, clustering heads and merging heads. The proposed methods
bring consistent and significant improvements over existing long-tailed
detection methods,
especially in rare and common classes. The proposed method achieves
state-of-the-art performance in the large vocabulary LVIS dataset with
different backbones and architectures. It generalizes well to more difficult
evaluation metrics, relatively balanced datasets, and the mask branch. This is
the first attempt to reveal the regression bias in long-tailed object detection
and to explore how to rectify it.
Hyperbolic Geometric Graph Representation Learning for Hierarchy-imbalance Node Classification
Learning unbiased node representations for imbalanced samples in graphs
has become an increasingly important topic. For graphs, a significant
challenge is that, beyond imbalance in the number of labeled training nodes
(quantity-imbalance), the topological properties of the nodes (e.g., locations,
roles) are also imbalanced (topology-imbalance). Existing studies on topology-imbalance
focus on the location or the local neighborhood structure of nodes, ignoring
the global underlying hierarchical properties of the graph, i.e., hierarchy. In
the real-world scenario, the hierarchical structure of graph data reveals
important topological properties of graphs and is relevant to a wide range of
applications. We find that training labeled nodes with different hierarchical
properties have a significant impact on the node classification tasks and
confirm it in our experiments. It is well known that hyperbolic geometry has a
unique advantage in representing the hierarchical structure of graphs.
Therefore, we attempt to explore the hierarchy-imbalance issue for node
classification of graph neural networks from the novel perspective of
hyperbolic geometry, including its characteristics and causes. Then, we propose
a novel hyperbolic geometric hierarchy-imbalance learning framework, named
HyperIMBA, to alleviate the hierarchy-imbalance issue caused by uneven
hierarchy-levels and cross-hierarchy connectivity patterns of labeled
nodes. Extensive experimental results demonstrate the superior effectiveness of
HyperIMBA for hierarchy-imbalance node classification tasks. Comment: Accepted by Web Conference (WWW) 202
Environment-Aware Dynamic Graph Learning for Out-of-Distribution Generalization
Dynamic graph neural networks (DGNNs) are increasingly pervasive in
exploiting spatio-temporal patterns on dynamic graphs. However, existing works
fail to generalize under distribution shifts, which are common in real-world
scenarios. As the generation of dynamic graphs is heavily influenced by latent
environments, investigating their impacts on the out-of-distribution (OOD)
generalization is critical. However, it remains unexplored with the following
two major challenges: (1) How to properly model and infer the complex
environments on dynamic graphs with distribution shifts? (2) How to discover
invariant patterns given inferred spatio-temporal environments? To solve these
challenges, we propose a novel Environment-Aware dynamic Graph LEarning (EAGLE)
framework for OOD generalization by modeling complex coupled environments and
exploiting spatio-temporal invariant patterns. Specifically, we first design
the environment-aware EA-DGNN to model environments through multi-channel
environment disentangling. Then, we propose an environment instantiation
mechanism for environment diversification with inferred distributions. Finally,
we discriminate spatio-temporal invariant patterns for out-of-distribution
prediction by the invariant pattern recognition mechanism and perform
fine-grained node-wise causal interventions with a mixture of instantiated
environment samples. Experiments on real-world and synthetic dynamic graph
datasets demonstrate the superiority of our method against state-of-the-art
baselines under distribution shifts. To the best of our knowledge, we are the
first to study OOD generalization on dynamic graphs from the environment
learning perspective. Comment: Accepted by the 37th Conference on Neural Information Processing
Systems (NeurIPS 2023)