60 research outputs found

    Visual Object Tracking: The Initialisation Problem

    Model initialisation is an important component of object tracking. Tracking algorithms are generally provided with the first frame of a sequence and a bounding box (BB) indicating the location of the object. This BB may contain a large number of background pixels in addition to the object, which can lead parts-based tracking algorithms to initialise their object models in background regions of the BB. In this paper, we tackle this as a missing labels problem, marking pixels sufficiently far from the BB as belonging to the background and learning the labels of the unknown pixels. Three techniques, One-Class SVM (OC-SVM), Sample-Based Background Model (SBBM) (a novel background model based on pixel samples), and Learning Based Digital Matting (LBDM), are adapted to the problem. These are evaluated with leave-one-video-out cross-validation on the VOT2016 tracking benchmark. Our evaluation shows that both OC-SVMs and SBBM are capable of providing a good level of segmentation accuracy but are too parameter-dependent to be used in real-world scenarios. We show that LBDM achieves significantly increased performance with parameters selected by cross-validation, and that it is robust to parameter variation. Comment: 15th Conference on Computer and Robot Vision (CRV 2018). Source code available at https://github.com/georgedeath/initialisation-proble
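
As a rough illustration of the missing-labels setup described in this abstract, the sketch below marks pixels outside an expanded bounding box as background and leaves the rest unknown. The function name, the label constants, and the fixed pixel `margin` are assumptions made for this example; the paper's own rule for "sufficiently far" may differ.

```python
# Hypothetical label values; not taken from the paper.
BACKGROUND, UNKNOWN = 0, -1


def initial_labels(height, width, box, margin):
    """Build an initial label map for a height x width image.

    `box` is (x, y, w, h) in pixels. Pixels outside the box expanded by
    `margin` on every side are labelled BACKGROUND; all pixels inside the
    expanded box are left UNKNOWN for a later learning step.
    """
    x, y, w, h = box
    # Half-open bounds of the expanded box, clipped to the image.
    x0, y0 = max(0, x - margin), max(0, y - margin)
    x1, y1 = min(width, x + w + margin), min(height, y + h + margin)

    labels = [[BACKGROUND] * width for _ in range(height)]
    for row in range(y0, y1):
        for col in range(x0, x1):
            labels[row][col] = UNKNOWN
    return labels
```

A classifier or matting method would then be fitted to the BACKGROUND pixels and used to label the UNKNOWN ones.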

    Object segmentation from low depth of field images and video sequences

    This thesis addresses the problem of autonomous object segmentation. To do so, the proposed segmentation method uses some prior information, namely that the image to be segmented will have a low depth of field and that the object of interest will be more in focus than the background. To differentiate the object from the background scene, a multiscale wavelet-based focus assessment is proposed. The focus assessment is used to generate a focus intensity map, and a sparse fields level set implementation of active contours is used to segment the object of interest. The initial contour is generated using a grid-based technique. The method is extended to segment low depth of field video sequences, with each successive initialisation for the active contours generated from the binary dilation of the previous frame's segmentation. Experimental results show that good segmentations can be achieved with a variety of different images, video sequences, and objects, with no user interaction or input. The method is applied to two different areas. In the first, the segmentations are used to automatically generate trimaps for use with matting algorithms. In the second, the method is used as part of a shape from silhouettes 3D object reconstruction system, replacing the need for a constrained background when generating silhouettes. In addition, not using thresholding to perform the silhouette segmentation allows objects with dark components or areas to be segmented accurately. Some examples of 3D models generated using silhouettes are shown.

    Object Tracking in Video with Part-Based Tracking by Feature Sampling

    Visual tracking of arbitrary objects is an active research topic in computer vision, with applications across multiple disciplines including video surveillance, activity analysis, robot vision, and human-computer interfaces. Despite great progress having been made in object tracking in recent years, it remains a challenge to design trackers that can deal with difficult tracking scenarios, such as camera motion, object motion change, occlusion, illumination changes, and object deformation. A promising way of tackling these types of problems is to use a part-based method: one which models and tracks small regions of the object and estimates the location of the object based on the tracked parts' positions. These approaches typically model parts of objects with histograms of various hand-crafted features extracted from the region in which the part is located. However, it is unclear how such relatively homogeneous regions should be represented to form an effective part-based tracker. In this thesis we present a part-based tracker that includes a model for object parts that is designed to empirically characterise the underlying colour distribution of an image region, representing it by pairs of randomly selected colour features and counts of how many pixels are similar to each feature. This novel feature representation is used to find probable locations for the part in future frames via a Bhattacharyya Distance-based metric, which is modified to prefer higher quality matches. Sets of candidate patch locations are generated by randomly generating non-shearing affine transformations of the part's previous locations and locally optimising the most likely sets of parts to allow for small intra-frame object deformations. We also present a study of model initialisation in online, model-free tracking and evaluate several techniques for selecting the regions of an image, given a target bounding box, most likely to contain an object.
The strengths and limitations of the combined tracker are evaluated on the VOT2016 and VOT2018 datasets using their evaluation protocol, which also allows an extensive evaluation of parameter robustness. The presented tracker is ranked first among part-based trackers on the VOT2018 dataset and is particularly robust to changes in object and camera motion, as well as object size changes.
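
For readers unfamiliar with the metric named in this abstract, the following is a minimal sketch of the standard Bhattacharyya distance between two count vectors (for example, per-feature pixel counts). It is not the modified metric the thesis describes, and the function name and input layout are assumptions made for this example.

```python
import math


def bhattacharyya_distance(p, q):
    """Standard Bhattacharyya distance between two non-negative count vectors.

    Both inputs are normalised to sum to one before comparison. The result
    is 0 for identical distributions and infinite when they do not overlap.
    """
    if len(p) != len(q):
        raise ValueError("inputs must have the same length")
    total_p, total_q = sum(p), sum(q)
    if total_p <= 0 or total_q <= 0:
        raise ValueError("inputs must have positive mass")

    # Bhattacharyya coefficient: sum over bins of sqrt(p_i * q_i).
    coefficient = sum(
        math.sqrt((a / total_p) * (b / total_q)) for a, b in zip(p, q)
    )
    coefficient = min(coefficient, 1.0)  # guard against rounding above 1
    if coefficient == 0.0:
        return math.inf
    return -math.log(coefficient)
```

A tracker could score candidate patches by computing this distance between the stored part model and the counts measured at each candidate location, preferring the smallest value.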

    A discrete graph Laplacian for signal processing

    In this thesis we exploit diffusion processes on graphs to address two fundamental problems of image processing: denoising and segmentation. We treat these two low-level vision problems at the pixel level under a unified framework: a graph embedding. This framework allows us to exploit recently introduced algorithms from the semi-supervised machine learning literature. We contribute two novel edge-preserving smoothing algorithms to the literature. Furthermore, we apply these edge-preserving smoothing algorithms to some computational photography tasks. Many recent computational photography tasks require the decomposition of an image into a smooth base layer containing large-scale intensity variations and a residual layer capturing fine details. Edge-preserving smoothing is the main computational mechanism in producing these multi-scale image representations. We, in effect, introduce a new approach to edge-preserving multi-scale image decompositions. Whereas prior approaches such as the bilateral filter and weighted least squares methods require multiple parameters to tune the response of the filters, our method requires only one. This parameter can be interpreted as a scale parameter. We demonstrate the utility of our approach by applying the method to computational photography tasks that utilise multi-scale image decompositions. With minimal modification to these edge-preserving smoothing algorithms, we show that we can extend them to produce interactive image segmentation. As a result, the operations of segmentation and denoising are conducted under a unified framework. Moreover, we discuss how our method is related to region-based active contours. We benchmark our proposed interactive segmentation algorithms against those based upon energy minimisation, specifically graph-cut methods. We demonstrate that we achieve competitive performance.

    Automatic example-based image colorization using location-aware cross-scale matching

    Given a reference colour image and a destination grayscale image, this paper presents a novel automatic colourisation algorithm that transfers colour information from the reference image to the destination image. Since the reference and destination images may contain content at different or even varying scales (due to changes of distance between objects and the camera), existing texture matching based methods can often perform poorly. We propose a novel cross-scale texture matching method to improve the robustness and quality of the colourisation results. Suitable matching scales are considered locally, which are then fused using global optimisation that minimises both the matching errors and spatial change of scales. The minimisation is efficiently solved using a multi-label graph-cut algorithm. Since only low-level texture features are used, texture matching based colourisation can still produce semantically incorrect results, such as meadow appearing above the sky. We consider a class of semantic violation where the statistics of up-down relationships learnt from the reference image are violated and propose an effective method to identify and correct unreasonable colourisation. Finally, a novel nonlocal ℓ1 optimisation framework is developed to propagate high confidence micro-scribbles to regions of lower confidence to produce a fully colourised image. Qualitative and quantitative evaluations show that our method outperforms several state-of-the-art methods.

    PatchMatch Belief Propagation for Correspondence Field Estimation and its Applications

    Correspondence field estimation is an important process that lies at the core of many different applications. It is often seen as an energy minimisation problem, which is usually decomposed into the combined minimisation of two energy terms. The first is the unary energy, or data term, which reflects how well the solution agrees with the data. The second is the pairwise energy, or smoothness term, which ensures that the solution displays a certain level of smoothness, which is crucial for many applications. This thesis explores the possibility of combining two well-established algorithms for correspondence field estimation, PatchMatch and Belief Propagation, in order to benefit from the strengths of both and overcome some of their weaknesses. Belief Propagation is a common algorithm that can be used to optimise energies comprising both unary and pairwise terms. It is, however, computationally expensive and thus not adapted to the continuous spaces which are often needed in imaging applications. On the other hand, PatchMatch is a simple, yet very efficient method for optimising the unary energy of such problems on continuous and high-dimensional spaces. The algorithm has two main components: the update of the solution space by sampling and the use of the spatial neighbourhood to propagate samples. We show how these components are related to the components of a specific form of Belief Propagation, called Particle Belief Propagation (PBP). PatchMatch, however, suffers from the lack of an explicit smoothness term. We show that unifying the two approaches yields a new algorithm, PMBP, which has improved performance compared to PatchMatch and is orders of magnitude faster than PBP. We apply our new optimiser to two different applications: stereo matching and optical flow. We validate the benefits of PMBP through a series of experiments and show that we consistently obtain lower errors than both PatchMatch and Belief Propagation.
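
The two-term energy described in this abstract can be sketched on a one-dimensional chain of labels as follows. This only evaluates the energy; it is not PatchMatch, Belief Propagation, or PMBP, and the function name, the chain topology, and the `smoothness_weight` parameter are assumptions made for this example.

```python
def total_energy(labels, unary, pairwise, smoothness_weight=1.0):
    """Evaluate a unary-plus-pairwise energy on a 1D chain of labels.

    `unary(i, label)` is the data cost of assigning `label` at position i.
    `pairwise(a, b)` is the smoothness cost between neighbouring labels.
    """
    data_term = sum(unary(i, label) for i, label in enumerate(labels))
    smoothness_term = sum(
        pairwise(a, b) for a, b in zip(labels, labels[1:])
    )
    return data_term + smoothness_weight * smoothness_term


# Illustrative costs: squared error against observations, absolute
# difference between neighbouring labels.
observed = [0, 0, 5, 0]


def squared_error(i, label):
    return (label - observed[i]) ** 2


def absolute_difference(a, b):
    return abs(a - b)
```

With these costs, copying the observations has zero data cost but pays for the jump at the third position, while an all-zero labelling is perfectly smooth but pays the data cost there. Raising `smoothness_weight` shifts the preference towards the smoother labelling.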

    Layered Scene Models from Single Hazy Images
