6,876 research outputs found

Comparison of different integral histogram based tracking algorithms

Author: Atluri Sriharsha
Publication venue: LSU Digital Commons
Publication date: 01/01/2011

Object tracking is an important subject in computer vision with a wide range of applications: security and surveillance, motion-based recognition, driver assistance systems, and human-computer interaction. The proliferation of high-powered computers, the availability of high-quality and inexpensive video cameras, and the increasing need for automated video analysis have generated a great deal of interest in object tracking algorithms. Tracking is usually performed in the context of high-level applications that require the location and/or shape of the object in every frame. Research on object tracking algorithms has been conducted for decades, and a number of approaches have been proposed. These approaches differ from each other in object representation, feature selection, and modeling of the shape and appearance of the object. Histogram-based tracking has proven to be an efficient approach in many applications. The integral histogram is a novel method that allows the histograms of multiple rectangular regions in an image to be extracted very efficiently. In recent years, a number of algorithms have adopted this method, each attempting to use the integral histogram more efficiently. In this paper, different algorithms that use this method as part of their tracking function are evaluated by comparing their tracking results, and an effort is made to modify some of the algorithms for better performance. The sequences used for the tracking experiments are grayscale (non-colored) and have significant shape and appearance variations for evaluating the performance of the algorithms. Extensive experimental results on these challenging sequences are presented, which demonstrate the tracking abilities of these algorithms.

Louisiana State University
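
As a rough illustration of the integral histogram idea mentioned in the abstract above, here is a minimal pure-Python sketch. The function names `integral_histogram` and `region_histogram`, the uniform binning of 8-bit gray levels, and the tiny sample image are assumptions made for illustration; they are not taken from the thesis, and the trackers it compares are not reproduced here.

```python
def integral_histogram(image, num_bins, max_value=256):
    """Build H so that H[y][x][b] counts pixels of bin b in rows < y, columns < x.

    `image` is a list of rows of integer gray levels in [0, max_value).
    Uniform binning is an assumption of this sketch.
    """
    height, width = len(image), len(image[0])
    H = [[[0] * num_bins for _ in range(width + 1)] for _ in range(height + 1)]
    for y in range(height):
        for x in range(width):
            pixel_bin = image[y][x] * num_bins // max_value
            for b in range(num_bins):
                # Same recurrence as an integral image, applied per bin.
                H[y + 1][x + 1][b] = H[y][x + 1][b] + H[y + 1][x][b] - H[y][x][b]
            H[y + 1][x + 1][pixel_bin] += 1
    return H


def region_histogram(H, top, left, bottom, right):
    """Histogram of rows top..bottom-1 and columns left..right-1.

    Needs only four lookups per bin, regardless of the region size.
    """
    num_bins = len(H[0][0])
    return [
        H[bottom][right][b] - H[top][right][b] - H[bottom][left][b] + H[top][left][b]
        for b in range(num_bins)
    ]


# Hypothetical 3x3 grayscale frame, used only to exercise the functions.
frame = [
    [0, 64, 128],
    [192, 255, 10],
    [70, 130, 200],
]
H = integral_histogram(frame, num_bins=4)
print(region_histogram(H, 0, 0, 3, 3))  # whole frame -> [2, 2, 2, 3]
print(region_histogram(H, 0, 1, 2, 3))  # top-right 2x2 block -> [1, 1, 1, 1]
```

Once the table is built in a single pass over the frame, the histogram of any candidate rectangle costs a constant number of operations per bin, which is what makes exhaustive search over many candidate regions affordable for a histogram-based tracker.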

ROAM: a Rich Object Appearance Model with Application to Rotoscoping

Authors: Miksik Ondrej, Pérez Patrick, Pérez-Rúa Juan-Manuel, Torr Philip H. S.
Publication date: 05/12/2016

Rotoscoping, the detailed delineation of scene elements through a video shot, is a painstaking task of tremendous importance in professional post-production pipelines. While pixel-wise segmentation techniques can help with this task, professional rotoscoping tools rely on parametric curves that offer artists much better interactive control over the definition, editing and manipulation of the segments of interest. Sticking to this prevalent rotoscoping paradigm, we propose a novel framework to capture and track the visual aspect of an arbitrary object in a scene, given a first closed outline of this object. This model combines a collection of local foreground/background appearance models spread along the outline, a global appearance model of the enclosed object, and a set of distinctive foreground landmarks. The structure of this rich appearance model allows simple initialization, efficient iterative optimization with exact minimization at each step, and on-line adaptation in videos. We demonstrate qualitatively and quantitatively the merit of this framework through comparisons with tools based on either dynamic segmentation with a closed curve or pixel-wise binary labelling.

arXiv.org e-Print Archive

Crossref

INRIA a CCSD electronic archive server

Oxford University Research Archive

Learning from minimally labeled data with accelerated convolutional neural networks

Author: Dundar Aysegul
Publication venue: Purdue University (bepress)
Publication date: 01/01/2016

The main objective of an artificial vision algorithm is to design a mapping function that takes an image as input and correctly classifies it into one of the user-determined categories. There are several important properties that the mapping function must satisfy for visual understanding. First, the function should produce good representations of the visual world, which make it possible to recognize images independently of pose, scale and illumination. Furthermore, the designed artificial vision system has to learn these representations by itself. Recent studies on Convolutional Neural Networks (ConvNets) have produced promising advancements in visual understanding. These networks attain significant performance gains by relying on hierarchical structures inspired by biological vision systems. In my research, I work mainly in two areas: 1) how ConvNets can be programmed to learn the optimal mapping function using the minimum amount of labeled data, and 2) how these networks can be accelerated for practical purposes. In this work, algorithms that learn from unlabeled data are studied, and a new framework that exploits unlabeled data is proposed. The proposed framework obtains state-of-the-art results on different tasks. Furthermore, this study presents an optimized streaming method for ConvNets’ hardware accelerator on an embedded platform. It is tested on object classification and detection applications using ConvNets. Experimental results indicate high computational efficiency and significant performance improvements over all other existing platforms.

Purdue E-Pubs

Real-Time 6D Object Pose Estimation on CPU

Authors: Hashimoto Manabu, Hattori Kosuke, Konishi Yoshinori
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 26/08/2019

We propose a fast and accurate method for 6D object pose estimation from an RGB-D image. Our proposed method is based on template matching and consists of three main technical components: PCOF-MOD (multimodal PCOF), a balanced pose tree (BPT), and optimum memory rearrangement for a coarse-to-fine search. Our model templates on densely sampled viewpoints and PCOF-MOD, which explicitly handles a certain range of 3D object poses, improve robustness against background clutter. BPT, an efficient tree-based data structure for a large number of templates, and template matching on rearranged feature maps, where nearby features are linearly aligned, accelerate the pose estimation. The experimental evaluation on tabletop and bin-picking datasets showed that our method achieved higher accuracy and faster speed in comparison with state-of-the-art techniques, including recent CNN-based approaches. Moreover, our model templates can be trained only from 3D CAD in a few minutes, and the pose estimation runs in near real-time (23 fps) on CPU. These features are suitable for any real application. Comment: accepted to IROS 201

arXiv.org e-Print Archive

Crossref