Augmenting Information Propagation Models with Graph Neural Networks
Department of Computer Science and Engineering
Conventional epidemic models are limited in their ability to capture the dynamics of real-world epidemics: for mathematical tractability they either place restrictions on the model, such as its topology and contact process, or focus only on average global behavior, which lacks the detail needed for further analysis. We propose a novel modeling approach that augments conventional epidemic models with Graph Neural Networks to improve their expressive power while preserving their useful mathematical structure. Simulation results show that the proposed model can predict spread times from both node-level and network-wide perspectives with high accuracy, with median relative errors below 15% across a wide range of scenarios.
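As a rough illustration of the kind of message passing a GNN performs on a contact graph, the following is a minimal numpy sketch; the toy graph, node features, random weights, and readout are hypothetical and not the paper's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy contact graph: adjacency matrix for 4 nodes (hypothetical example).
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

def gnn_layer(H, A, W):
    """One round of mean-aggregation message passing followed by ReLU."""
    deg = A.sum(axis=1, keepdims=True)
    H_agg = (A @ H) / np.maximum(deg, 1)   # average neighbour features
    return np.maximum(H_agg @ W, 0.0)      # linear transform + ReLU

# Node features: e.g. initial infection state and local degree.
H = np.hstack([np.array([[1], [0], [0], [0]], dtype=float),
               A.sum(axis=1, keepdims=True)])

W1 = rng.normal(size=(2, 8))
W2 = rng.normal(size=(8, 8))
w_out = rng.normal(size=(8,))

# Two rounds of message passing, then a per-node readout that, once
# trained, could regress node-level spread times.
H = gnn_layer(H, A, W1)
H = gnn_layer(H, A, W2)
spread_time_pred = H @ w_out   # one scalar prediction per node
```

With trained weights, the per-node outputs could be aggregated (e.g. by a max or mean over nodes) to give the network-wide spread-time estimate the abstract mentions.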
Contrastive Vicinal Space for Unsupervised Domain Adaptation
Recent unsupervised domain adaptation methods have utilized vicinal space
between the source and target domains. However, the equilibrium collapse of
labels, a problem where the source labels are dominant over the target labels
in the predictions of vicinal instances, has never been addressed. In this
paper, we propose an instance-wise minimax strategy that minimizes the entropy
of high uncertainty instances in the vicinal space to tackle the stated
problem. We divide the vicinal space into two subspaces through the solution of
the minimax problem: contrastive space and consensus space. In the contrastive
space, inter-domain discrepancy is mitigated by constraining instances to have
contrastive views and labels, and the consensus space reduces the confusion
between intra-domain categories. The effectiveness of our method is
demonstrated on public benchmarks, including Office-31, Office-Home, and
VisDA-C, achieving state-of-the-art performance. We further show that our
method outperforms the current state-of-the-art methods on PACS, which
indicates that our instance-wise approach also works well for multi-source
domain adaptation. Code is available at https://github.com/NaJaeMin92/CoVi.
Comment: 10 pages, 7 figures, 5 tables
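To make the instance-wise minimax idea concrete, here is a minimal numpy sketch of finding the highest-entropy vicinal point between a source and a target instance; the logits, the grid search over mixing ratios, and mixing in logit space are all illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def entropy(p):
    return -(p * np.log(p + 1e-12)).sum(axis=-1)

# Hypothetical classifier logits for a source and a target instance.
logits_src = np.array([4.0, 0.5, 0.2])   # confident source prediction
logits_tgt = np.array([0.3, 0.4, 0.2])   # uncertain target prediction

# Vicinal instance: convex combination of the two instances. The inner
# "max" step picks the mixing ratio lam with the highest prediction
# entropy (the most uncertain vicinal point); we scan a grid instead of
# solving it in closed form.
lams = np.linspace(0.0, 1.0, 11)
ents = np.array([entropy(softmax(l * logits_src + (1 - l) * logits_tgt))
                 for l in lams])
lam_star = lams[ents.argmax()]   # adversarial mixing ratio

# The outer "min" step would then update the model to reduce the entropy
# at lam_star, e.g. by gradient descent on entropy(softmax(f(x_mix))).
```

Instances whose adversarial ratio leans toward the source side would fall in the contrastive space, and those near the target side in the consensus space, matching the subspace division described above.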
Self-Supervised Visual Learning by Variable Playback Speeds Prediction of a Video
We propose a self-supervised visual learning method by predicting the
variable playback speeds of a video. Without semantic labels, we learn the
spatio-temporal visual representation of the video by leveraging the variations
in the visual appearance according to different playback speeds under the
assumption of temporal coherence. To learn the spatio-temporal visual
variations across the entire video, we not only predict a single playback
speed but also generate clips of various playback speeds and directions with
randomized starting points. Hence, the visual representation can be
successfully learned from the meta-information (playback speeds and
directions) of the video. We also propose a new layer-dependable temporal
group normalization
method that can be applied to 3D convolutional networks to improve the
representation learning performance where we divide the temporal features into
several groups and normalize each one using the different corresponding
parameters. We validate the effectiveness of our method by fine-tuning it to
the action recognition and video retrieval tasks on UCF-101 and HMDB-51.
Comment: Accepted by IEEE Access on May 19, 202
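The grouped temporal normalization described above can be sketched as follows; this is a minimal numpy re-implementation for a single sample, and it omits the learned per-group affine parameters, so the exact grouping and parameterization are assumptions rather than the paper's code:

```python
import numpy as np

def temporal_group_norm(x, num_groups, eps=1e-5):
    """Normalize a (C, T, H, W) feature map per temporal group.

    The temporal axis T is split into `num_groups` contiguous groups and
    each group is normalized with its own mean and variance; the paper
    additionally learns separate affine parameters per group (omitted).
    """
    C, T, H, W = x.shape
    assert T % num_groups == 0, "T must be divisible by num_groups"
    g = x.reshape(C, num_groups, T // num_groups, H, W)
    mean = g.mean(axis=(0, 2, 3, 4), keepdims=True)
    var = g.var(axis=(0, 2, 3, 4), keepdims=True)
    g = (g - mean) / np.sqrt(var + eps)
    return g.reshape(C, T, H, W)

# Example: 8 channels, 16 frames, 4x4 spatial map, 4 temporal groups.
x = np.random.default_rng(0).normal(size=(8, 16, 4, 4))
y = temporal_group_norm(x, num_groups=4)
```

Each temporal group ends up zero-mean and unit-variance independently, which is what lets different phases of a clip be normalized with different statistics.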
Switching Temporary Teachers for Semi-Supervised Semantic Segmentation
The teacher-student framework, prevalent in semi-supervised semantic
segmentation, mainly employs the exponential moving average (EMA) to update a
single teacher's weights based on the student's. However, EMA updates raise a
problem in that the weights of the teacher and student are getting coupled,
causing a potential performance bottleneck. Furthermore, this problem may
become more severe when training with more complicated labels such as
segmentation masks but with few annotated data. This paper introduces Dual
Teacher, a simple yet effective approach that employs dual temporary teachers
aiming to alleviate the coupling problem for the student. The temporary
teachers work in shifts and are progressively improved, consistently
preventing the teacher and student from becoming excessively close.
Specifically, the
temporary teachers periodically take turns generating pseudo-labels to train a
student model and maintain the distinct characteristics of the student model
for each epoch. Consequently, Dual Teacher achieves competitive performance on
the PASCAL VOC, Cityscapes, and ADE20K benchmarks with remarkably shorter
training times than state-of-the-art methods. Moreover, we demonstrate that our
approach is model-agnostic and compatible with both CNN- and Transformer-based
models. Code is available at \url{https://github.com/naver-ai/dual-teacher}.
Comment: NeurIPS-202
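The alternating-teacher schedule can be sketched in a few lines; this toy version uses plain parameter vectors, a dummy student update, and an assumed momentum value, so it only illustrates the EMA-plus-switching mechanics rather than the released training code:

```python
import numpy as np

def ema_update(teacher, student, m=0.99):
    """Standard EMA teacher update on toy parameter vectors."""
    return m * teacher + (1 - m) * student

# Two temporary teachers with different initializations (hypothetical).
teachers = [np.zeros(3), np.ones(3)]
student = np.array([0.5, 0.5, 0.5])

for epoch in range(4):
    active = epoch % 2  # the teachers take turns, one per epoch
    # ... the active teacher would generate pseudo-labels for the
    # student's training here ...
    student = student + 0.1 * (np.array([1.0, 0.0, 1.0]) - student)  # dummy step
    # Only the active teacher tracks the student this epoch, so the two
    # teachers accumulate distinct views of the student's trajectory.
    teachers[active] = ema_update(teachers[active], student)
```

Because each teacher is only updated on its own epochs, the two EMA trajectories stay distinct, which is what keeps the student from coupling to any single teacher.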
BackTrack: Robust template update via Backward Tracking of candidate template
Variations of target appearance such as deformations, illumination variance,
occlusion, etc., are the major challenges of visual object tracking that
negatively impact the performance of a tracker. An effective method to tackle
these challenges is template update, which updates the template to reflect the
change of appearance in the target object during tracking. However, with
template updates, inadequate quality of new templates or inappropriate timing
of updates may induce a model drift problem, which severely degrades the
tracking performance. Here, we propose BackTrack, a robust and reliable method
to quantify the confidence of the candidate template by backward tracking it on
the past frames. Based on the confidence score of candidates from BackTrack, we
can update the template with a reliable candidate at the right time while
rejecting unreliable candidates. BackTrack is a generic template update
scheme and is applicable to any template-based tracker. Extensive
experiments verify the effectiveness of BackTrack over existing template
update algorithms, as it achieves state-of-the-art performance on various
tracking benchmarks.
Comment: 14 pages, 7 figures
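The backward-tracking confidence check can be illustrated with bounding boxes; the mean-IoU score, the example boxes, and the acceptance threshold below are assumptions for illustration, not the paper's exact scoring rule:

```python
import numpy as np

def iou(a, b):
    """Intersection-over-union of two [x1, y1, x2, y2] boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def backtrack_confidence(past_boxes, backward_boxes):
    """Mean IoU between the boxes originally predicted on past frames and
    the boxes re-predicted by tracking the candidate template backwards."""
    return float(np.mean([iou(p, b) for p, b in zip(past_boxes, backward_boxes)]))

# Hypothetical boxes from two past frames.
past = [[0, 0, 10, 10], [1, 1, 11, 11]]        # forward-tracking results
backward = [[0, 0, 10, 10], [2, 2, 12, 12]]    # backward-tracking results
conf = backtrack_confidence(past, backward)

THRESH = 0.6  # hypothetical acceptance threshold
accept = conf >= THRESH  # update the template only when backtracking agrees
```

A candidate template that drifts onto the background would fail to re-localize the target on past frames, producing a low score and being rejected, which is how the scheme avoids model drift.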
Robust Discriminative Metric Learning for Image Representation
Metric learning has attracted significant attention in the past decades, owing to appealing advances in various real-world applications such as person re-identification and face recognition. Traditional supervised metric learning attempts to seek a discriminative metric that minimizes the pairwise distance of within-class data samples while maximizing the pairwise distance of data samples from different classes. However, it remains a challenge to build a robust and discriminative metric, especially for corrupted data in real-world applications. In this paper, we propose a Robust Discriminative Metric Learning algorithm (RDML) via fast low-rank representation and a denoising strategy. Specifically, the metric learning problem is guided by a discriminative regularization that incorporates pairwise or class-wise information. Moreover, low-rank basis learning is jointly optimized with the metric to better uncover the global data structure and remove noise. Furthermore, fast low-rank representation is implemented to mitigate the computational burden and ensure scalability on large-scale datasets. Finally, we evaluate the learned metric on several challenging tasks, e.g., face recognition/verification, object recognition, and image clustering. The experimental results verify the effectiveness of the proposed algorithm in comparison with many metric learning algorithms, including deep learning ones.
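The discriminative objective, shrinking within-class distances while expanding between-class ones under a learned Mahalanobis metric, can be sketched as follows; the toy data, single gradient-style pass, and learning rate are illustrative, and the low-rank and denoising components of RDML are omitted:

```python
import numpy as np

def mahalanobis_sq(x, y, M):
    """Squared Mahalanobis distance (x - y)^T M (x - y) for a PSD M."""
    d = x - y
    return float(d @ M @ d)

# Toy 2-D data from two classes (hypothetical).
X = np.array([[0.0, 0.0], [0.2, 0.1],    # class 0
              [3.0, 3.0], [3.1, 2.9]])   # class 1
y = np.array([0, 0, 1, 1])

# One gradient-style pass over all pairs: decrease d^T M d for same-class
# pairs, increase it for different-class pairs.
M = np.eye(2)
lr = 0.05
for i in range(len(X)):
    for j in range(i + 1, len(X)):
        d = (X[i] - X[j])[:, None]
        grad = d @ d.T                  # gradient of d^T M d w.r.t. M
        M -= lr * grad if y[i] == y[j] else -lr * grad

# Project back onto the PSD cone so M remains a valid metric.
w, V = np.linalg.eigh(M)
M = (V * np.maximum(w, 0)) @ V.T
```

After the update, within-class pairs are closer under the learned metric than between-class pairs, which is the pairwise-constraint behavior the discriminative regularization encodes.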