Search CORE

17,991 research outputs found

Color Constancy Using CNNs

Author: Bianco Simone
Cusano Claudio
Schettini Raimondo
Publication venue
Publication date: 17/04/2015
Field of study

In this work we describe a Convolutional Neural Network (CNN) to accurately predict the scene illumination. Taking image patches as input, the CNN works in the spatial domain without using hand-crafted features that are employed by most previous methods. The network consists of one convolutional layer with max pooling, one fully connected layer and three output nodes. Within the network structure, feature learning and regression are integrated into one optimization process, which leads to a more effective model for estimating scene illumination. This approach achieves state-of-the-art performance on a standard dataset of RAW images. Preliminary experiments on images with spatially varying illumination demonstrate the stability of the local illuminant estimation ability of our CNN.Comment: Accepted at DeepVision: Deep Learning in Computer Vision 2015 (CVPR 2015 workshop

arXiv.org e-Print Archive

BSUV-Net: a fully-convolutional neural network for background subtraction of unseen videos

Author: Ishwar Prakash
Konrad Janusz
Tezcan M Ozan
Publication venue
Publication date: 14/01/2020
Field of study

Background subtraction is a basic task in computer vision and video processing often applied as a pre-processing step for object tracking, people recognition, etc. Recently, a number of successful background-subtraction algorithms have been proposed, however nearly all of the top-performing ones are supervised. Crucially, their success relies upon the availability of some annotated frames of the test video during training. Consequently, their performance on completely “unseen” videos is undocumented in the literature. In this work, we propose a new, supervised, background subtraction algorithm for unseen videos (BSUV-Net) based on a fully-convolutional neural network. The input to our network consists of the current frame and two background frames captured at different time scales along with their semantic segmentation maps. In order to reduce the chance of overfitting, we also introduce a new data-augmentation technique which mitigates the impact of illumination difference between the background frames and the current frame. On the CDNet-2014 dataset, BSUV-Net outperforms stateof-the-art algorithms evaluated on unseen videos in terms of several metrics including F-measure, recall and precision.Accepted manuscrip

arXiv.org e-Print Archive

Boston University Institutional Repository (OpenBU)