ClassMix: Segmentation-Based Data Augmentation for Semi-Supervised Learning
The state of the art in semantic segmentation is steadily increasing in
performance, resulting in more precise and reliable segmentations in many
different applications. However, progress is limited by the cost of generating
labels for training, which sometimes requires hours of manual labor for a
single image. Because of this, semi-supervised methods have been applied to
this task, with varying degrees of success. A key challenge is that common
augmentations used in semi-supervised classification are less effective for
semantic segmentation. We propose a novel data augmentation mechanism called
ClassMix, which generates augmentations by mixing unlabelled samples,
leveraging the network's predictions to respect object boundaries. We
evaluate this augmentation technique on two common semi-supervised semantic
segmentation benchmarks, showing that it attains state-of-the-art results.
Lastly, we also provide extensive ablation studies comparing different design
decisions and training regimes.
Comment: This paper has been accepted to WACV202
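The mixing idea described above can be sketched in a few lines. This is our illustrative reading of the mechanism, not the authors' implementation: half of the classes in the network's prediction for one unlabelled image are selected, and the corresponding pixels are pasted onto a second image, so the paste mask follows predicted object boundaries.

```python
import numpy as np

def classmix(img_a, img_b, pred_a, rng=None):
    """Illustrative sketch of class-based mixing: paste the pixels of half
    of the classes predicted in image A onto image B.

    img_a, img_b : (H, W, C) float arrays, two unlabelled samples.
    pred_a       : (H, W) int array, the network's argmax prediction for img_a.
    Returns the mixed image and the binary paste mask.
    """
    rng = rng or np.random.default_rng(0)
    classes = np.unique(pred_a)
    # Randomly select half of the classes present in prediction A.
    n_sel = max(1, len(classes) // 2)
    selected = rng.choice(classes, size=n_sel, replace=False)
    # The mask follows the predicted boundaries of the selected classes.
    mask = np.isin(pred_a, selected).astype(img_a.dtype)[..., None]
    mixed = mask * img_a + (1.0 - mask) * img_b
    return mixed, mask
```

In a semi-supervised pipeline the same mask would also mix the two predictions to form a pseudo-label target for the mixed image; that step is omitted here for brevity.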
A model-agnostic approach for generating Saliency Maps to explain inferred decisions of Deep Learning Models
The widespread use of black-box AI models has raised the need for algorithms and methods that explain the decisions made by these models. In recent years, the AI research community has become increasingly interested in models' explainability, since black-box models are taking over more and more complicated and challenging tasks. Explainability becomes critical considering the dominance of deep learning techniques for a wide range of applications, including but not limited to computer vision. In the direction of understanding the inference process of deep learning models, many methods that provide human-comprehensible evidence for the decisions of AI models have been developed, with the vast majority basing their operation on having access to the internal architecture and parameters of these models (e.g., the weights of neural networks). We propose a model-agnostic method for generating saliency maps that has access only to the output of the model and does not require additional information such as gradients. We use Differential Evolution (DE) to identify which image pixels are the most influential in a model's decision-making process and produce class activation maps (CAMs) whose quality is comparable to the quality of CAMs created with model-specific algorithms. DE-CAM achieves good performance without requiring access to the internal details of the model's architecture, at the cost of higher computational complexity.
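A minimal sketch of the black-box search this abstract describes, under our own assumptions (the paper's exact encoding and DE settings are not given here): a standard DE/rand/1/bin loop evolves a coarse soft mask so that the model's class score, queried only through a score function, is maximized; the best mask then serves as a saliency map. No gradients or internals are touched.

```python
import numpy as np

def de_saliency(score_fn, shape=(8, 8), pop=20, iters=40, rng=None):
    """Illustrative DE search for influential pixels (our simplified
    formulation): evolve a coarse soft mask to maximize the black-box
    class score score_fn(mask). Only model outputs are queried.
    """
    rng = rng or np.random.default_rng(0)
    d = shape[0] * shape[1]
    X = rng.random((pop, d))                      # population of soft masks
    fit = np.array([score_fn(x.reshape(shape)) for x in X])
    F, CR = 0.5, 0.9                              # standard DE/rand/1/bin settings
    for _ in range(iters):
        for i in range(pop):
            a, b, c = rng.choice([j for j in range(pop) if j != i], 3,
                                 replace=False)
            mutant = np.clip(X[a] + F * (X[b] - X[c]), 0.0, 1.0)
            cross = rng.random(d) < CR
            trial = np.where(cross, mutant, X[i])
            f = score_fn(trial.reshape(shape))
            if f > fit[i]:                        # greedy selection
                X[i], fit[i] = trial, f
    return X[np.argmax(fit)].reshape(shape)       # best mask ~ saliency map
```

The extra cost the abstract mentions is visible here: every trial vector costs one forward pass of the model, so the budget is pop × iters queries rather than a single gradient computation.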
Multi-body SE(3) Equivariance for Unsupervised Rigid Segmentation and Motion Estimation
A truly generalizable approach to rigid segmentation and motion estimation is
fundamental to 3D understanding of articulated objects and moving scenes. In
view of the tightly coupled relationship between segmentation and motion
estimates, we present an SE(3) equivariant architecture and a training strategy
to tackle this task in an unsupervised manner. Our architecture comprises two
lightweight and inter-connected heads that predict segmentation masks using
point-level invariant features and motion estimates from SE(3) equivariant
features without the prerequisites of category information. Our unified
training strategy can be performed online while jointly optimizing the two
predictions by exploiting the interrelations among scene flow, segmentation
mask, and rigid transformations. We show experiments on four datasets as
evidence of the superiority of our method both in terms of model performance
and computational efficiency with only 0.25M parameters and 0.92G FLOPs. To the
best of our knowledge, this is the first work designed for category-agnostic
part-level SE(3) equivariance in dynamic point clouds.
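The interrelation among scene flow, segmentation masks, and rigid transformations that the training strategy exploits can be written down compactly. The formulation below is our illustration, not the paper's exact loss: a point's scene flow is the segmentation-weighted combination of the K per-part rigid motions applied to it.

```python
import numpy as np

def flow_from_rigid_parts(points, weights, rotations, translations):
    """Illustrative link between the two predictions: scene flow as the
    soft-segmentation-weighted sum of per-part rigid displacements.

    points       : (N, 3) point cloud.
    weights      : (N, K) soft segmentation (each row sums to 1).
    rotations    : (K, 3, 3) per-part rotation matrices.
    translations : (K, 3) per-part translations.
    Returns the (N, 3) predicted scene flow.
    """
    # Apply every part's rigid transform to every point: (K, N, 3).
    moved = np.einsum('kij,nj->kni', rotations, points) \
            + translations[:, None, :]
    disp = moved - points[None, :, :]        # per-part displacement field
    # Weight by the soft segmentation and sum over the K parts.
    return np.einsum('nk,kni->ni', weights, disp)
```

Comparing this composed flow against an observed flow gives a supervision signal that jointly constrains both heads, which is how the two predictions can be optimized together without category labels.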
CAT: Learning to Collaborate Channel and Spatial Attention from Multi-Information Fusion
Channel and spatial attention mechanisms have proven to provide an evident
performance boost for deep convolutional neural networks (CNNs). Most existing
methods focus on one of the two or run them in parallel (or in series),
neglecting the collaboration between the two attentions. In order to better establish the
feature interaction between the two types of attention, we propose a
plug-and-play attention module, which we term "CAT"-activating the
Collaboration between spatial and channel Attentions based on learned Traits.
Specifically, we represent traits as trainable coefficients (i.e.,
colla-factors) to adaptively combine contributions of different attention
modules to fit different image hierarchies and tasks better. Moreover, in
addition to the global average pooling (GAP) and global maximum pooling (GMP)
operators, we propose global entropy pooling (GEP), an effective component for
suppressing noise signals by measuring the information disorder of feature
maps. We introduce a three-way pooling operation into attention modules and
apply the adaptive mechanism to fuse their outcomes. Extensive experiments on
MS COCO, Pascal-VOC, Cifar-100, and ImageNet show that our CAT outperforms
existing state-of-the-art attention mechanisms in object detection, instance
segmentation, and image classification. The model and code will be released
soon.
Comment: 8 pages, 5 figures
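The two ingredients named above can be sketched as follows. Both functions are our own simplified reading: the exact entropy estimator and the normalization of the colla-factors are assumptions, not the paper's released formulation.

```python
import numpy as np

def global_entropy_pooling(fmap, bins=16, eps=1e-8):
    """Illustrative GEP: pool each channel to the Shannon entropy of its
    value histogram, so disordered (noisy) channels can be identified.
    fmap : (C, H, W) feature map. Returns a (C,) vector.
    """
    out = np.empty(fmap.shape[0])
    for c in range(fmap.shape[0]):
        hist, _ = np.histogram(fmap[c], bins=bins)
        p = hist / (hist.sum() + eps)
        out[c] = -np.sum(p * np.log(p + eps))
    return out

def colla_fuse(gap, gmp, gep, colla):
    """Adaptive three-way fusion with trainable colla-factors
    (softmax-normalized here as an assumption)."""
    w = np.exp(colla) / np.exp(colla).sum()
    return w[0] * gap + w[1] * gmp + w[2] * gep
```

Because the colla-factors are trainable coefficients, gradient descent can shift weight among the three pooling outcomes per attention module, which is how the fusion adapts to different image hierarchies and tasks.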
DACS: Domain Adaptation via Cross-domain Mixed Sampling
Semantic segmentation models based on convolutional neural networks have
recently displayed remarkable performance for a multitude of applications.
However, these models typically do not generalize well when applied on new
domains, especially when going from synthetic to real data. In this paper we
address the problem of unsupervised domain adaptation (UDA), which attempts to
train on labelled data from one domain (source domain), and simultaneously
learn from unlabelled data in the domain of interest (target domain). Existing
methods have seen success by training on pseudo-labels for these unlabelled
images. Multiple techniques have been proposed to mitigate low-quality
pseudo-labels arising from the domain shift, with varying degrees of success.
We propose DACS: Domain Adaptation via Cross-domain mixed Sampling, which mixes
images from the two domains along with the corresponding labels and
pseudo-labels. The network is then trained on these mixed samples, in addition
to the labelled data itself. We demonstrate the effectiveness of our solution by
achieving state-of-the-art results for GTA5 to Cityscapes, a common
synthetic-to-real semantic segmentation benchmark for UDA.
Comment: This paper has been accepted to WACV202
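The cross-domain mixing step can be sketched in a few lines; this is our minimal reading of the mechanism, not the authors' code. Pixels belonging to half of the classes in a labelled source image are pasted, together with their ground-truth labels, onto a target-domain image and its pseudo-label.

```python
import numpy as np

def dacs_mix(src_img, src_label, tgt_img, tgt_pseudo, rng=None):
    """Illustrative cross-domain mixed sampling: paste half of the source
    image's classes (with ground-truth labels) onto a target image (with
    its pseudo-label).

    src_img, tgt_img     : (H, W, C) images from source / target domain.
    src_label, tgt_pseudo: (H, W) int label map and pseudo-label map.
    Returns the mixed image and mixed label map.
    """
    rng = rng or np.random.default_rng(0)
    classes = np.unique(src_label)
    selected = rng.choice(classes, size=max(1, len(classes) // 2),
                          replace=False)
    mask = np.isin(src_label, selected)
    mixed_img = np.where(mask[..., None], src_img, tgt_img)
    mixed_label = np.where(mask, src_label, tgt_pseudo)
    return mixed_img, mixed_label
```

Because every mixed sample contains trusted source pixels with ground-truth labels next to target pixels with pseudo-labels, the effect of low-quality pseudo-labels on training is diluted, which is the motivation the abstract gives.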