Search Tracker: Human-derived object tracking in-the-wild through large-scale search and retrieval
Humans use context and scene knowledge to easily localize moving objects in
conditions of complex illumination changes, scene clutter and occlusions. In
this paper, we present a method to leverage human knowledge in the form of
annotated video libraries in a novel search and retrieval based setting to
track objects in unseen video sequences. For every video sequence, a document
that represents motion information is generated. Documents of the unseen video
are queried against the library at multiple scales to find videos with similar
motion characteristics. This provides us with coarse localization of objects in
the unseen video. We further adapt these retrieved object locations to the new
video using an efficient warping scheme. The proposed method is validated on
in-the-wild video surveillance datasets where we outperform state-of-the-art
appearance-based trackers. We also introduce a new challenging dataset with
complex object appearance changes.
Comment: Under review with the IEEE Transactions on Circuits and Systems for
Video Technology
Automatic Action Annotation in Weakly Labeled Videos
Manual spatio-temporal annotation of human action in videos is laborious,
requires several annotators and contains human biases. In this paper, we
present a weakly supervised approach to automatically obtain spatio-temporal
annotations of an actor in action videos. We first obtain a large number of
action proposals in each video. To capture the few most representative action
proposals in each video and avoid processing thousands of them, we rank them
using optical flow and saliency in a 3D-MRF-based framework and select a few
proposals using a MAP-based proposal subset selection method. We demonstrate that
this ranking preserves the high-quality action proposals. Several such
proposals are generated for each video of the same action. Our next challenge
is to iteratively select one proposal from each video so that all proposals are
globally consistent. We formulate this as a Generalized Maximum Clique Graph
problem using shape, global and fine-grained similarity of proposals across the
videos. The output of our method is the most action representative proposals
from each video. Our method can also annotate multiple instances of the same
action in a video. We have validated our approach on three challenging action
datasets: UCF Sports, sub-JHMDB and THUMOS'13, and have obtained promising
results compared to several baseline methods. Moreover, on UCF Sports, we
demonstrate that action classifiers trained on these automatically obtained
spatio-temporal annotations have comparable performance to the classifiers
trained on ground truth annotations.
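As an illustration of the selection step described in this abstract (one proposal per video, chosen so that all picks are mutually consistent), the following is a minimal sketch rather than the authors' method. It assumes each proposal is reduced to a single number and that similarity is the negative absolute difference; the names `select_consistent_proposals` and `negative_distance` are hypothetical. It uses exhaustive search, which is only practical for a handful of videos and proposals.

```python
from itertools import combinations, product


def select_consistent_proposals(features, similarity):
    """Pick one proposal per video maximizing the total pairwise similarity.

    features: list with one entry per video; each entry is a list of
              proposal descriptors (any type accepted by `similarity`).
    similarity: function of two descriptors returning a float.
    Returns (tuple of chosen proposal indices, total similarity).
    """
    best_choice, best_score = None, float("-inf")
    # Enumerate every way of taking exactly one proposal from each video.
    for choice in product(*(range(len(video)) for video in features)):
        # Score the choice by summing similarity over all video pairs.
        score = sum(
            similarity(features[a][choice[a]], features[b][choice[b]])
            for a, b in combinations(range(len(features)), 2)
        )
        if score > best_score:
            best_choice, best_score = choice, score
    return best_choice, best_score


def negative_distance(x, y):
    """Toy similarity (assumption): closer scalar descriptors score higher."""
    return -abs(x - y)


# Three videos with two toy proposals each (made-up scalar descriptors).
videos = [[0.1, 0.9], [0.85, 0.3], [0.2, 0.95]]
choice, score = select_consistent_proposals(videos, negative_distance)
```

Here `choice` holds the index of the selected proposal in each video. A practical system would replace the exhaustive loop with an approximate solver and the scalar descriptors with shape and appearance features.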
STV-based Video Feature Processing for Action Recognition
In comparison to still image-based processes, video features can provide rich and intuitive information about dynamic events that occur over a period of time, such as human actions, crowd behaviours, and other subject pattern changes. Although substantial progress has been made in image processing over the last decade, with successful applications in face matching and object recognition, video-based event detection remains one of the most difficult challenges in computer vision research due to its complex continuous or discrete input signals, arbitrary dynamic feature definitions, and often ambiguous analytical methods. In this paper, a Spatio-Temporal Volume (STV) and region intersection (RI) based 3D shape-matching method is proposed to facilitate the definition and recognition of human actions recorded in videos. The distinctive characteristics and the performance gain of the devised approach stem from a coefficient factor-boosted 3D region intersection and matching mechanism developed in this research. The paper also reports an investigation into techniques for efficient STV data filtering to reduce the number of voxels (volumetric pixels) that need to be processed in each operational cycle of the implemented system. The encouraging features and improvements in operational performance registered in the experiments are discussed at the end.
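To make the idea of region intersection between two spatio-temporal volumes concrete, here is a minimal sketch under stated assumptions. It represents each STV as a set of occupied (x, y, t) voxels built from binary silhouette frames and scores the match with a plain intersection-over-union ratio. This is not the coefficient factor-boosted mechanism of the paper, whose details the abstract does not give; the function names and toy frames are hypothetical.

```python
def build_stv(frames):
    """Stack per-frame binary silhouettes into a set of occupied voxels.

    frames: list of 2D lists (rows of 0/1 values), one per time step.
    Returns a set of (x, y, t) tuples for every non-zero pixel.
    """
    return {
        (x, y, t)
        for t, frame in enumerate(frames)
        for y, row in enumerate(frame)
        for x, value in enumerate(row)
        if value
    }


def region_intersection_score(stv_a, stv_b):
    """Overlap ratio of two voxel sets: |A intersect B| / |A union B|."""
    union = stv_a | stv_b
    if not union:
        return 0.0  # two empty volumes: define the score as zero
    return len(stv_a & stv_b) / len(union)


# Two toy 2x2 silhouette sequences of two frames each (made-up data).
template = build_stv([[[1, 1], [0, 0]], [[1, 1], [0, 0]]])
query = build_stv([[[1, 1], [0, 0]], [[0, 1], [0, 1]]])
score = region_intersection_score(template, query)
```

Storing only occupied voxels keeps the comparison proportional to the silhouette size rather than to the full volume, which is in the spirit of the voxel filtering the abstract mentions.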
Interactive Perception for Cluttered Environments
Robotics research tends to focus upon either non-contact sensing or machine manipulation, but not both. This paper explores the benefits of combining the two by addressing the problem of extracting and classifying unknown objects within a cluttered environment, such as those found in recycling and service robot applications. In the proposed approach, a pile of objects lies on a flat background, and the goal of the robot is to sift through the pile and classify each object so that it can be studied further. One object should be removed at a time with minimal disturbance to the other objects. We propose an algorithm, based upon graph-based segmentation and stereo matching, that automatically computes a desired grasp point that enables the objects to be removed one at a time. The algorithm then isolates each object to be classified by color, shape and flexibility. Experiments on a number of different objects demonstrate the ability to classify each item through interaction and to label it for further use and study.
Mathematics and Morphogenesis of the City: A Geometrical Approach
Cities are living organisms. They are out of equilibrium, open systems that
never stop developing and sometimes die. The local geography can be compared to
a shell constraining its development. In brief, a city's current layout is a
step in a running morphogenesis process. Thus cities display a huge diversity
of shapes, and none of the traditional models from random graphs, complex networks
theory or stochastic geometry takes into account geometrical, functional and
dynamical aspects of a city in the same framework. We present here a global
mathematical model dedicated to cities that permits describing, manipulating
and explaining cities' overall shape and layout of their street systems. This
street-based framework reconciles the topological and geometrical sides of the
problem. From the static analysis of several French towns (topology of first
and second order, anisotropy, streets scaling) we make the hypothesis that the
development of a city follows a logic of division / extension of space. We
propose a dynamical model that mimics this logic and which from simple general
rules and a few parameters succeeds in generating a large diversity of cities
and in reproducing the general features the static analysis has pointed out.
Comment: 13 pages, 13 figures