Visual Object Tracking: The Initialisation Problem
Model initialisation is an important component of object tracking. Tracking
algorithms are generally provided with the first frame of a sequence and a
bounding box (BB) indicating the location of the object. This BB may contain a
large number of background pixels in addition to the object and can lead to
parts-based tracking algorithms initialising their object models in background
regions of the BB. In this paper, we tackle this as a missing labels problem,
marking pixels sufficiently away from the BB as belonging to the background and
learning the labels of the unknown pixels. Three techniques, One-Class SVM
(OC-SVM), Sampled-Based Background Model (SBBM) (a novel background model based
on pixel samples), and Learning Based Digital Matting (LBDM), are adapted to
the problem. These are evaluated with leave-one-video-out cross-validation on
the VOT2016 tracking benchmark. Our evaluation shows that both OC-SVMs and
SBBM are capable of providing a good level of segmentation accuracy but are
too parameter-dependent to be used in real-world scenarios. We show that LBDM
achieves significantly better performance with parameters selected by
cross-validation, and that it is robust to parameter variation.
Comment: 15th Conference on Computer and Robot Vision (CRV 2018). Source code
available at https://github.com/georgedeath/initialisation-proble
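To make the missing-labels formulation concrete, the sketch below shows one
way the OC-SVM variant could be set up: pixels sufficiently far outside the
bounding box are marked as definite background, a one-class model is fitted
to their colours, and the unknown pixels inside the box are then classified.
The margin factor, raw RGB features, and SVM hyperparameters here are
illustrative assumptions, not the paper's tuned settings.

    # A minimal sketch of the missing-labels setup, using scikit-learn.
    import numpy as np
    from sklearn.svm import OneClassSVM

    def label_bb_pixels(frame, bb, margin=0.5, nu=0.1):
        """frame: HxWx3 float array; bb: (x, y, w, h) bounding box."""
        x, y, w, h = bb
        # Pixels sufficiently far from the BB are definite background.
        mx, my = int(w * margin), int(h * margin)
        mask = np.ones(frame.shape[:2], dtype=bool)
        mask[max(0, y - my):y + h + my, max(0, x - mx):x + w + mx] = False
        background = frame[mask][::10]  # subsample for speed

        # Fit a one-class model to the background colours only.
        svm = OneClassSVM(nu=nu, gamma='scale').fit(background)

        # Unknown pixels inside the BB: inliers (+1) resemble background;
        # outliers (-1) are taken to belong to the object.
        unknown = frame[y:y + h, x:x + w].reshape(-1, 3)
        return (svm.predict(unknown) == -1).reshape(h, w)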
Deep Image Matting: A Comprehensive Survey
Image matting refers to extracting a precise alpha matte from natural images,
and it plays a critical role in various downstream applications, such as image
editing. Although the problem is ill-posed, traditional methods have attempted
to solve it for decades. The emergence of deep learning has
revolutionized the field of image matting and given birth to multiple new
techniques, including automatic, interactive, and referring image matting. This
paper presents a comprehensive review of recent advancements in image matting
in the era of deep learning. We focus on two fundamental sub-tasks: auxiliary
input-based image matting, which involves user-defined input to predict the
alpha matte, and automatic image matting, which generates results without any
manual intervention. We systematically review the existing methods for these
two tasks according to their task settings and network structures and provide a
summary of their advantages and disadvantages. Furthermore, we introduce the
commonly used image matting datasets and evaluate the performance of
representative matting methods both quantitatively and qualitatively. Finally,
we discuss relevant applications of image matting and highlight existing
challenges and potential opportunities for future research. We also maintain a
public repository to track the rapid development of deep image matting at
https://github.com/JizhiziLi/matting-survey
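The ill-posedness mentioned above follows directly from the per-pixel
compositing equation that all of these methods must invert; written in
standard matting notation (our notation, not quoted from the survey):

    % Compositing equation: each observed colour I_p mixes an unknown
    % foreground F_p and background B_p via the alpha matte \alpha_p.
    I_p = \alpha_p F_p + (1 - \alpha_p) B_p, \qquad \alpha_p \in [0, 1]

For an RGB image this gives three observations per pixel but seven unknowns
(F_p, B_p in R^3 plus alpha_p), so auxiliary input such as a trimap, or a
learned prior in the automatic setting, is needed to constrain the solution.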
Context-Aware Image Matting for Simultaneous Foreground and Alpha Estimation
Natural image matting is an important problem in computer vision and
graphics. It is an ill-posed problem when only an input image is available
without any external information. While recent deep learning approaches have
shown promising results, they estimate only the alpha matte. This paper
presents a context-aware natural image matting method for simultaneous
foreground and alpha matte estimation. Our method employs two encoder networks
to extract essential information for matting. In particular, we use a matting
encoder to learn local features and a context encoder to obtain more global
context information. We concatenate the outputs from these two encoders and
feed them into decoder networks to simultaneously estimate the foreground and
alpha matte. To train this network, we employ both the standard Laplacian
loss and a feature loss: the former helps to achieve high numerical
performance, while the latter leads to more perceptually plausible
results. We also report several data augmentation strategies that greatly
improve the network's generalization performance. Our qualitative and
quantitative experiments show that our method enables high-quality matting for
a single natural image. Our inference codes and models have been made publicly
available at https://github.com/hqqxyy/Context-Aware-Matting.
Comment: This is the camera-ready version of the ICCV 2019 paper.
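As a rough sketch of the two-encoder arrangement described above, written in
PyTorch: the layer widths, strides, and single fused decoder are illustrative
assumptions (the authors' released code is the authoritative reference), and
the output is left at quarter resolution for brevity.

    import torch
    import torch.nn as nn

    def conv_block(cin, cout, stride=2):
        return nn.Sequential(nn.Conv2d(cin, cout, 3, stride, 1),
                             nn.BatchNorm2d(cout), nn.ReLU(inplace=True))

    class TwoEncoderMatting(nn.Module):
        def __init__(self):
            super().__init__()
            # Matting encoder: local, fine-grained features.
            self.matting_enc = nn.Sequential(conv_block(3, 32),
                                             conv_block(32, 64))
            # Context encoder: downsamples further for global context.
            self.context_enc = nn.Sequential(conv_block(3, 32),
                                             conv_block(32, 64),
                                             conv_block(64, 64))
            self.up = nn.Upsample(scale_factor=2, mode='bilinear',
                                  align_corners=False)
            # Decoder consumes concatenated features and jointly predicts
            # foreground RGB (3 channels) and alpha (1 channel).
            self.decoder = nn.Sequential(conv_block(128, 64, stride=1),
                                         nn.Conv2d(64, 4, 3, 1, 1))

        def forward(self, image):
            local_feat = self.matting_enc(image)            # H/4 scale
            global_feat = self.up(self.context_enc(image))  # H/8 -> H/4
            fused = torch.cat([local_feat, global_feat], dim=1)
            out = self.decoder(fused)
            return out[:, :3], torch.sigmoid(out[:, 3:])  # (fg, alpha)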
Image-based Material Editing
Photo editing software allows digital images to be blurred, warped, or
re-colored at the touch of a button. However, it is not currently possible to
change the material appearance of an object except by painstakingly painting
over the appropriate pixels. Here we present a set of methods for
automatically replacing one material with another, completely different
material, starting with only a single high dynamic range (HDR) image and an
alpha matte specifying the object. Our approach exploits the fact that human
vision is surprisingly tolerant of certain (sometimes enormous) physical
inaccuracies. Thus, it may be possible to produce a visually compelling
illusion of material transformations without fully reconstructing the
lighting or geometry. We employ a range of algorithms depending on the target
material. First, an approximate depth map is derived from the image
intensities using bilateral filters. The resulting surface normals are then
used to map data onto the surface of the object to specify its material
appearance. To create transparent or translucent materials, the mapped data
are derived from the object's background. To create textured materials, the
mapped data are a texture map. The surface normals can also be used to apply
arbitrary bidirectional reflectance distribution functions (BRDFs) to the
surface, allowing us to simulate a wide range of materials. To facilitate the
process of material editing, we generate the HDR image with a novel algorithm
that is robust against noise in individual exposures; this ensures that noise
that could adversely affect the shape recovery of the object is removed. We
also present an algorithm to automatically generate alpha mattes. This
algorithm requires two images as input, one where the object is in focus and
one where the background is in focus, and automatically produces an
approximate matte indicating which pixels belong to the object. The result is
then refined by a second algorithm to generate an accurate alpha matte, which
can be given as input to our material editing techniques.
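A minimal sketch of the first step described above: recovering an approximate
depth map and surface normals from image intensities with a bilateral filter
(here via OpenCV). Treating filtered luminance as depth and the particular
filter parameters are assumptions for illustration, not the paper's exact
pipeline.

    import cv2
    import numpy as np

    def normals_from_intensity(image_bgr, matte):
        """image_bgr: HxWx3 uint8; matte: HxW alpha values in [0, 1]."""
        gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY).astype(np.float32)
        # Bilateral filtering smooths shading noise while keeping the
        # silhouette sharp, yielding an approximate depth map.
        depth = cv2.bilateralFilter(gray, d=9, sigmaColor=50, sigmaSpace=7)
        depth *= matte.astype(np.float32)  # restrict to the object

        # Normals from depth gradients: n = (-dz/dx, -dz/dy, 1), normalised.
        dzdx = cv2.Sobel(depth, cv2.CV_32F, 1, 0, ksize=3)
        dzdy = cv2.Sobel(depth, cv2.CV_32F, 0, 1, ksize=3)
        n = np.dstack([-dzdx, -dzdy, np.ones_like(depth)])
        return depth, n / np.linalg.norm(n, axis=2, keepdims=True)

The normals can then index an environment map or texture atlas to re-render
the object under the new material.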