Search CORE

286 research outputs found

Adaptive appearance learning for visual object tracking

Author: Gu Irene Y.H.
Khan Zulfiqar H.
Publication venue
Publication date: 01/01/2011
Field of study

This paper addresses online learning of reference object distribution in the context of two hybrid tracking schemes that combine the mean shift with local point feature correspondences, and the mean shift under the Bayesian framework, respectively. The reference object distribution is built up by a kernel-weighted color histogram. The main contributions of the proposed schemes includes: (a) an adaptive learning strategy that seeks to update the reference object distribution when the changes are caused by the intrinsic object dynamic without partial occlusion/ intersection; (b) novel dynamic maintenance of object feature points by exploring both foreground and background sets; (c) integration of adaptive appearance and local point features in joint object appearance similarity and local point features correspondences-based tracker to improve [7]; (d) integration of adaptive appearance in joint appearance similarity and particle filter tracker under the Bayesian framework to improve [10]. Experimental results on a range of videos captured by a dynamic/stationary camera demonstrate the effectiveness of the proposed schemes in terms of robustness to partial occlusions, tracking drifts and tightness and accuracy of tracked bounding box. Comparisons are also made with the two hybrid trackers together with 3 existing trackers

Chalmers Research

Chalmers Publication Library

Online Learning and Robust Visual Tracking using Local Features and Global Appearances of Video Objects

Author: Irene Y.H. Gu
Zulfiqar H. Khan
Publication venue: 'IntechOpen'
Publication date: 28/02/2011
Field of study

IntechOpen

Object Tracking

Author
Publication venue: 'IntechOpen'
Publication date: 20/04/2021
Field of study

Object tracking consists in estimation of trajectory of moving objects in the sequence of images. Automation of the computer object tracking is a difficult task. Dynamics of multiple parameters changes representing features and motion of the objects, and temporary partial or full occlusion of the tracked objects have to be considered. This monograph presents the development of object tracking algorithms, methods and systems. Both, state of the art of object tracking methods and also the new trends in research are described in this book. Fourteen chapters are split into two sections. Section 1 presents new theoretical ideas whereas Section 2 presents real-life applications. Despite the variety of topics contained in this monograph it constitutes a consisted knowledge in the field of computer object tracking. The intention of editor was to follow up the very quick progress in the developing of methods as well as extension of the application

Directory of Open Access Books (DOAB)

Image-guided ToF depth upsampling: a survey

Author: A Buades
A Harrison
A Kolb
A Prusak
B Huhle
B Langmann
C Herrera
C Richardt
D Barash
D Fofi
D Min
D Nehab
D Scharstein
Dmitry Chetverikov
E Stoykova
F Remondino
H Guan
Iván Eichhardt
J Carter
J Jung
J Kopf
J Park
J Salvi
J Tian
J Zhu
J Zhu
JD Ouwerkerk Van
JW Weingarten
K Bredies
KP Murphy
L Yin
M Fu
M Gong
M Hansard
M Lindner
M Sonka
MA Fischler
N Pfeifer
P Milanfar
Q Yang
R Fattal
R Szeliski
S Foix
S Fuchs
S Mandal
S Paris
S Schwarz
S Schwarz
SA Guomundsson
SB Gokturk
SM Seitz
SP Awate
U Hahne
U Hahne
V Villena-Martínez
X Xu
Z Zhang
Z Zhang
Zsolt Jankó
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017
Field of study

Recently, there has been remarkable growth of interest in the development and applications of time-of-flight (ToF) depth cameras. Despite the permanent improvement of their characteristics, the practical applicability of ToF cameras is still limited by low resolution and quality of depth measurements. This has motivated many researchers to combine ToF cameras with other sensors in order to enhance and upsample depth images. In this paper, we review the approaches that couple ToF depth images with high-resolution optical images. Other classes of upsampling methods are also briefly discussed. Finally, we provide an overview of performance evaluation tests presented in the related studies

Crossref

SZTAKI Publication Repository

Computational intelligence approaches to robotics, automation, and control [Volume guest editors]

Author: Chen Yi
Gu Dongbing
Hu Huosheng
Li Yun
Xu Peter
Zhang Jun
Publication venue: 'Hindawi Limited'
Publication date: 01/01/2015
Field of study

No abstract available

Enlighten

On unifying sparsity and geometry for image-based 3D scene representation

Author: Tošic Ivana
Publication venue: Lausanne, EPFL
Publication date: 06/04/2009
Field of study

Demand has emerged for next generation visual technologies that go beyond conventional 2D imaging. Such technologies should capture and communicate all perceptually relevant three-dimensional information about an environment to a distant observer, providing a satisfying, immersive experience. Camera networks offer a low cost solution to the acquisition of 3D visual information, by capturing multi-view images from different viewpoints. However, the camera's representation of the data is not ideal for common tasks such as data compression or 3D scene analysis, as it does not make the 3D scene geometry explicit. Image-based scene representations fundamentally require a multi-view image model that facilitates extraction of underlying geometrical relationships between the cameras and scene components. Developing new, efficient multi-view image models is thus one of the major challenges in image-based 3D scene representation methods. This dissertation focuses on defining and exploiting a new method for multi-view image representation, from which the 3D geometry information is easily extractable, and which is additionally highly compressible. The method is based on sparse image representation using an overcomplete dictionary of geometric features, where a single image is represented as a linear combination of few fundamental image structure features (edges for example). We construct the dictionary by applying a unitary operator to an analytic function, which introduces a composition of geometric transforms (translations, rotation and anisotropic scaling) to that function. The advantage of this approach is that the features across multiple views can be related with a single composition of transforms. We then establish a connection between image components and scene geometry by defining the transforms that satisfy the multi-view geometry constraint, and obtain a new geometric multi-view correlation model. We first address the construction of dictionaries for images acquired by omnidirectional cameras, which are particularly convenient for scene representation due to their wide field of view. Since most omnidirectional images can be uniquely mapped to spherical images, we form a dictionary by applying motions on the sphere, rotations, and anisotropic scaling to a function that lives on the sphere. We have used this dictionary and a sparse approximation algorithm, Matching Pursuit, for compression of omnidirectional images, and additionally for coding 3D objects represented as spherical signals. Both methods offer better rate-distortion performance than state of the art schemes at low bit rates. The novel multi-view representation method and the dictionary on the sphere are then exploited for the design of a distributed coding method for multi-view omnidirectional images. In a distributed scenario, cameras compress acquired images without communicating with each other. Using a reliable model of correlation between views, distributed coding can achieve higher compression ratios than independent compression of each image. However, the lack of a proper model has been an obstacle for distributed coding in camera networks for many years. We propose to use our geometric correlation model for distributed multi-view image coding with side information. The encoder employs a coset coding strategy, developed by dictionary partitioning based on atom shape similarity and multi-view geometry constraints. Our method results in significant rate savings compared to independent coding. An additional contribution of the proposed correlation model is that it gives information about the scene geometry, leading to a new camera pose estimation method using an extremely small amount of data from each camera. Finally, we develop a method for learning stereo visual dictionaries based on the new multi-view image model. Although dictionary learning for still images has received a lot of attention recently, dictionary learning for stereo images has been investigated only sparingly. Our method maximizes the likelihood that a set of natural stereo images is efficiently represented with selected stereo dictionaries, where the multi-view geometry constraint is included in the probabilistic modeling. Experimental results demonstrate that including the geometric constraints in learning leads to stereo dictionaries that give both better distributed stereo matching and approximation properties than randomly selected dictionaries. We show that learning dictionaries for optimal scene representation based on the novel correlation model improves the camera pose estimation and that it can be beneficial for distributed coding

Infoscience - École polytechnique fédérale de Lausanne

Robust real-time tracking in smart camera networks

Author: Jelača Vedran
Publication venue: Ghent University. Faculty of Engineering and Architecture
Publication date: 01/01/2015
Field of study

Ghent University Academic Bibliography

Real-Time RGB-D Camera Pose Estimation in Novel Scenes using a Relocalisation Cascade

Author: Cavallari Tommaso
Di Stefano Luigi
Golodetz Stuart
Lord Nicholas A.
Prisacariu Victor A.
Torr Philip H. S.
Valentin Julien
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2019
Field of study

Camera pose estimation is an important problem in computer vision. Common techniques either match the current image against keyframes with known poses, directly regress the pose, or establish correspondences between keypoints in the image and points in the scene to estimate the pose. In recent years, regression forests have become a popular alternative to establish such correspondences. They achieve accurate results, but have traditionally needed to be trained offline on the target scene, preventing relocalisation in new environments. Recently, we showed how to circumvent this limitation by adapting a pre-trained forest to a new scene on the fly. The adapted forests achieved relocalisation performance that was on par with that of offline forests, and our approach was able to estimate the camera pose in close to real time. In this paper, we present an extension of this work that achieves significantly better relocalisation performance whilst running fully in real time. To achieve this, we make several changes to the original approach: (i) instead of accepting the camera pose hypothesis without question, we make it possible to score the final few hypotheses using a geometric approach and select the most promising; (ii) we chain several instantiations of our relocaliser together in a cascade, allowing us to try faster but less accurate relocalisation first, only falling back to slower, more accurate relocalisation as necessary; and (iii) we tune the parameters of our cascade to achieve effective overall performance. These changes allow us to significantly improve upon the performance our original state-of-the-art method was able to achieve on the well-known 7-Scenes and Stanford 4 Scenes benchmarks. As additional contributions, we present a way of visualising the internal behaviour of our forests and show how to entirely circumvent the need to pre-train a forest on a generic scene.Comment: Tommaso Cavallari, Stuart Golodetz, Nicholas Lord and Julien Valentin assert joint first authorshi

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

Oxford University Research Archive

Long-range video motion estimation using point trajectories

Author: Sand Peter (Peter M.), 1977-
Publication venue: Massachusetts Institute of Technology
Publication date: 01/01/2006
Field of study

Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2006.Includes bibliographical references (leaves 97-104).This thesis describes a new approach to video motion estimation, in which motion is represented using a set of particles. Each particle is an image point sample with a long-duration trajectory and other properties. To optimize these particles, we measure point-based matching along the particle trajectories and distortion between the particles. The resulting motion representation is useful for a variety of applications and differs from optical flow, feature tracking, and parametric or layer-based models. We demonstrate the algorithm on challenging real-world videos that include complex scene geometry, multiple types of occlusion, regions with low texture, and non-rigid deformation.by Peter Sand.Ph.D

DSpace@MIT