67 research outputs found
ClassMix: Segmentation-Based Data Augmentation for Semi-Supervised Learning
The state of the art in semantic segmentation is steadily increasing in
performance, resulting in more precise and reliable segmentations in many
different applications. However, progress is limited by the cost of generating
labels for training, which sometimes requires hours of manual labor for a
single image. Because of this, semi-supervised methods have been applied to
this task, with varying degrees of success. A key challenge is that common
augmentations used in semi-supervised classification are less effective for
semantic segmentation. We propose a novel data augmentation mechanism called
ClassMix, which generates augmentations by mixing unlabelled samples,
leveraging the network's predictions to respect object boundaries. We
evaluate this augmentation technique on two common semi-supervised semantic
segmentation benchmarks, showing that it attains state-of-the-art results.
Lastly, we provide extensive ablation studies comparing different design
decisions and training regimes.
Comment: This paper has been accepted to WACV202
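The mixing operation the abstract describes — cutting the pixels belonging to a subset of the classes predicted in one unlabelled image and pasting them onto another — can be sketched as follows. This is a hypothetical reconstruction from the abstract, not the authors' code; the function name, array shapes, and the way `classes_to_cut` is chosen are all assumptions.

```python
import numpy as np

def classmix(image_a, image_b, pred_a, classes_to_cut):
    """Paste the pixels of `image_a` whose predicted class is in
    `classes_to_cut` onto `image_b`; the label maps would be mixed
    with the same mask to form the pseudo-label target."""
    mask = np.isin(pred_a, classes_to_cut)             # H x W boolean mask
    mixed = np.where(mask[..., None], image_a, image_b)  # broadcast over channels
    return mixed, mask
```

In the paper's setting, `pred_a` would be the network's own segmentation prediction for an unlabelled image, and `classes_to_cut` would typically be a random half of the classes present in that prediction.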
RELLISUR: A Real Low-Light Image Super-Resolution Dataset
The RELLISUR dataset contains real low-light, low-resolution images paired with normal-light, high-resolution reference counterparts. This dataset aims to fill the gap between low-light image enhancement and low-resolution image enhancement (Super-Resolution (SR)), which are currently only addressed separately in the literature, even though the visibility of real-world images is often limited by both low light and low resolution. The dataset contains 12750 paired images of different resolutions and degrees of low-light illumination, to facilitate the learning of deep-learning-based models that can perform a direct mapping from degraded images with low visibility to high-quality, detail-rich images of high resolution.
ReBound: An Open-Source 3D Bounding Box Annotation Tool for Active Learning
In recent years, supervised learning has become the dominant paradigm for
training deep-learning based methods for 3D object detection. Lately, the
academic community has studied 3D object detection in the context of autonomous
vehicles (AVs) using publicly available datasets such as nuScenes and Argoverse
2.0. However, these datasets may have incomplete annotations, often only
labeling a small subset of objects in a scene. Although commercial services
exist for 3D bounding box annotation, they are often prohibitively expensive.
To address these limitations, we propose ReBound, an open-source 3D
visualization and dataset re-annotation tool that works across different
datasets. In this paper, we detail the design of our tool and present survey
results that highlight the usability of our software. Further, we show that
ReBound is effective for exploratory data analysis and can facilitate
active learning. Our code and documentation are available at
https://github.com/ajedgley/ReBound
Comment: Accepted to CHI 2023 Workshop - Intervening, Teaming, Delegating:
Creating Engaging Automation Experiences (AutomationXP
Patch-wise Graph Contrastive Learning for Image Translation
Recently, patch-wise contrastive learning has been drawing attention for image
translation by exploring the semantic correspondence between the input and
output images. To further explore the patch-wise topology for high-level
semantic understanding, here we exploit the graph neural network to capture the
topology-aware features. Specifically, we construct the graph based on the
patch-wise similarity from a pretrained encoder, whose adjacency matrix is
shared to enhance the consistency of patch-wise relation between the input and
the output. Then, we obtain the node feature from the graph neural network, and
enhance the correspondence between the nodes by increasing mutual information
using the contrastive loss. In order to capture the hierarchical semantic
structure, we further propose the graph pooling. Experimental results
demonstrate state-of-the-art results for image translation, thanks to the
semantic encoding by the constructed graphs.
Comment: AAAI 202
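The two core ingredients the abstract names — a shared patch-similarity adjacency matrix built from encoder features, and a contrastive loss that increases mutual information between corresponding input/output nodes — can be sketched in a toy form. This is an illustrative sketch, not the authors' implementation; the function names, the top-k adjacency rule, and the InfoNCE form of the loss are assumptions.

```python
import numpy as np

def adjacency(feats, k=2):
    """Top-k cosine-similarity graph over patch features (rows of `feats`)."""
    f = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    sim = f @ f.T                                    # pairwise cosine similarity
    adj = np.zeros_like(sim)
    for i, row in enumerate(sim):
        adj[i, np.argsort(row)[-k - 1:]] = 1.0       # keep top-k neighbours (+ self)
    return adj

def info_nce(x, y, tau=0.07):
    """InfoNCE-style loss: matching rows of x and y are positive pairs."""
    x = x / np.linalg.norm(x, axis=1, keepdims=True)
    y = y / np.linalg.norm(y, axis=1, keepdims=True)
    logits = x @ y.T / tau                           # N x N similarity logits
    logits -= logits.max(axis=1, keepdims=True)      # numerical stability
    p = np.exp(logits)
    p /= p.sum(axis=1, keepdims=True)
    return -np.log(np.diag(p)).mean()                # diagonal = positive pairs
```

In the paper's pipeline, `x` and `y` would be node features produced by a graph neural network over the input and output images, with `adjacency` computed once from the pretrained encoder and shared between the two.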
A model-agnostic approach for generating Saliency Maps to explain inferred decisions of Deep Learning Models
The widespread use of black-box AI models has raised the need for algorithms and methods that explain the decisions made by these models. In recent years, the AI research community has become increasingly interested in models' explainability, since black-box models are taking over more and more complicated and challenging tasks. Explainability becomes critical considering the dominance of deep learning techniques for a wide range of applications, including but not limited to computer vision. In the direction of understanding the inference process of deep learning models, many methods that provide human-comprehensible evidence for the decisions of AI models have been developed, with the vast majority relying on access to the internal architecture and parameters of these models (e.g., the weights of neural networks). We propose a model-agnostic method for generating saliency maps that has access only to the output of the model and does not require additional information such as gradients. We use Differential Evolution (DE) to identify which image pixels are the most influential in a model's decision-making process and produce class activation maps (CAMs) whose quality is comparable to that of CAMs created with model-specific algorithms. Our method, DE-CAM, achieves good performance without requiring access to the internal details of the model's architecture, at the cost of higher computational complexity.
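The black-box idea above can be sketched with a small differential-evolution loop: treat the model purely as a scoring function and evolve a coarse mask whose kept regions best preserve the target-class score. This is a hypothetical toy, not the paper's algorithm; the `score_fn` interface, grid size, and DE hyperparameters are all assumptions.

```python
import numpy as np

def de_saliency(score_fn, grid=4, pop=12, iters=40, f=0.8, cr=0.9, seed=0):
    """Evolve a grid x grid mask in [0,1] maximizing the black-box score_fn.
    score_fn receives only the mask; no gradients or model internals needed."""
    rng = np.random.default_rng(seed)
    x = rng.random((pop, grid * grid))                # population of candidate masks
    fit = np.array([score_fn(m.reshape(grid, grid)) for m in x])
    for _ in range(iters):
        for i in range(pop):
            a, b, c = x[rng.choice(pop, 3, replace=False)]
            trial = np.clip(a + f * (b - c), 0.0, 1.0)  # DE/rand/1 mutation
            cross = rng.random(x.shape[1]) < cr         # binomial crossover
            trial = np.where(cross, trial, x[i])
            s = score_fn(trial.reshape(grid, grid))
            if s > fit[i]:                              # greedy selection
                x[i], fit[i] = trial, s
    return x[np.argmax(fit)].reshape(grid, grid)        # saliency-like map
```

In practice, `score_fn` would upsample the mask, multiply it with the input image, query the model for the target-class probability, and subtract a sparsity penalty so the mask concentrates on the truly influential pixels.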
InteriorNet: Mega-scale Multi-sensor Photo-realistic Indoor Scenes Dataset
Datasets have gained an enormous amount of popularity in the computer vision community, from training and evaluation of Deep Learning-based methods to benchmarking Simultaneous Localization and Mapping (SLAM). Without a doubt, synthetic imagery bears a vast potential due to scalability in terms of amounts of data obtainable without tedious manual ground truth annotations or measurements. Here, we present a dataset with the aim of providing a higher degree of photo-realism, larger scale, and more variability, as well as serving a wider range of purposes compared to existing datasets. Our dataset leverages the availability of millions of professional interior designs and millions of production-level furniture and object assets -- all coming with fine geometric details and high-resolution textures. We render high-resolution and high frame-rate video sequences following realistic trajectories while supporting various camera types as well as providing inertial measurements. Together with the release of the dataset, we will make the executable program of our interactive simulator as well as our renderer available at https://interiornetdataset.github.io. To showcase the usability and uniqueness of our dataset, we show benchmarking results of both sparse and dense SLAM algorithms.
BlinkFlow: A Dataset to Push the Limits of Event-based Optical Flow Estimation
Event cameras provide high temporal precision, low data rates, and high
dynamic range visual perception, which are well-suited for optical flow
estimation. While data-driven optical flow estimation has obtained great
success in RGB cameras, its generalization performance is seriously hindered in
event cameras mainly due to the limited and biased training data. In this
paper, we present a novel simulator, BlinkSim, for the fast generation of
large-scale data for event-based optical flow. BlinkSim consists of a
configurable rendering engine and a flexible engine for event data simulation.
By leveraging the wealth of current 3D assets, the rendering engine enables us
to automatically build up thousands of scenes with different objects, textures,
and motion patterns and render very high-frequency images for realistic event
data simulation. Based on BlinkSim, we construct a large training dataset and
an evaluation benchmark, BlinkFlow, that contains abundant, diverse, and
challenging event data with optical flow ground truth. Experiments show that
BlinkFlow improves the generalization performance of state-of-the-art methods
by more than 40% on average and up to 90%. Moreover, we further propose an
Event optical Flow transFormer (E-FlowFormer) architecture. Powered by our
BlinkFlow, E-FlowFormer outperforms the SOTA methods by up to 91% on the MVSEC
dataset and by 14% on the DSEC dataset, and presents the best generalization
performance.
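Event simulators such as BlinkSim build on the standard event-camera model: a pixel emits an event each time its log-intensity changes by a contrast threshold since its last event. A minimal sketch of that model, under assumed names and a simplified two-frame setting (real simulators interpolate high-frequency frames and track per-pixel reference levels):

```python
import numpy as np

def events_between(frame0, frame1, c=0.2, eps=1e-6):
    """Return (y, x, polarity) events triggered between two intensity frames.
    An event fires for every multiple of the contrast threshold `c` crossed
    in log-intensity; polarity is +1 for brightening, -1 for darkening."""
    d = np.log(frame1 + eps) - np.log(frame0 + eps)   # log-intensity change
    n = np.floor(np.abs(d) / c).astype(int)           # events per pixel
    ys, xs = np.nonzero(n)
    pol = np.sign(d[ys, xs]).astype(int)
    return [(y, x, p)
            for y, x, p, k in zip(ys, xs, pol, n[ys, xs])
            for _ in range(k)]                        # one tuple per event
```

Rendering at very high frame rates, as BlinkSim does, keeps each per-step `d` small so the simulated event stream approximates the continuous sensor response.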