Selective sampling importance resampling particle filter tracking with multibag subspace restoration
Integration of the 3D Environment for UAV Onboard Visual Object Tracking
Single visual object tracking from an unmanned aerial vehicle (UAV) poses
fundamental challenges such as object occlusion, small-scale objects,
background clutter, and abrupt camera motion. To tackle these difficulties, we
propose to integrate the 3D structure of the observed scene into a
detection-by-tracking algorithm. We introduce a pipeline that combines a
model-free visual object tracker, a sparse 3D reconstruction, and a state
estimator. The 3D reconstruction of the scene is computed with an image-based
Structure-from-Motion (SfM) component that enables us to leverage a state
estimator in the corresponding 3D scene during tracking. By representing the
position of the target in 3D space rather than in image space, we stabilize the
tracking during ego-motion and improve the handling of occlusions, background
clutter, and small-scale objects. We evaluated our approach on prototypical
image sequences, captured from a UAV with low-altitude oblique views. For this
purpose, we adapted an existing dataset for visual object tracking and
reconstructed the observed scene in 3D. The experimental results demonstrate
that the proposed approach outperforms methods using plain visual cues as well
as approaches leveraging image-space-based state estimations. We believe that
our approach can be beneficial for traffic monitoring, video surveillance, and
navigation.
Comment: Accepted in MDPI journal Applied Sciences
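The pipeline above filters the target's state in 3D scene coordinates rather than in image space. As a rough illustration only (not the authors' code), the 3D state estimator could be a constant-velocity Kalman filter over position and velocity, fed with 3D positions lifted from the 2D tracker via the SfM reconstruction; the class name and noise parameters below are assumptions for the sketch:

```python
import numpy as np

class Tracker3D:
    """Hypothetical 3D state estimator: constant-velocity Kalman filter
    over [x, y, z, vx, vy, vz] in SfM scene coordinates."""

    def __init__(self, dt=1.0, q=1e-2, r=1e-1):
        self.x = np.zeros(6)                 # state estimate
        self.P = np.eye(6)                   # state covariance
        self.F = np.eye(6)                   # constant-velocity motion model
        self.F[:3, 3:] = dt * np.eye(3)
        self.H = np.hstack([np.eye(3), np.zeros((3, 3))])  # observe position only
        self.Q = q * np.eye(6)               # process noise (assumed)
        self.R = r * np.eye(3)               # measurement noise (assumed)

    def predict(self):
        """Propagate the state; used alone while the target is occluded."""
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:3]

    def update(self, z):
        """Correct with a 3D position z lifted from the 2D tracker
        via the sparse SfM reconstruction."""
        y = z - self.H @ self.x
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.x = self.x + K @ y
        self.P = (np.eye(6) - K @ self.H) @ self.P
        return self.x[:3]
```

Representing the state in 3D is what stabilizes tracking under ego-motion: during occlusions the `update` step is simply skipped and `predict` coasts the target through the 3D scene, independent of camera movement.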
SALSA: A Novel Dataset for Multimodal Group Behavior Analysis
Studying free-standing conversational groups (FCGs) in unstructured social
settings (e.g., cocktail parties) is gratifying due to the wealth of information
available at the group (mining social networks) and individual (recognizing
native behavioral and personality traits) levels. However, analyzing social
scenes involving FCGs is also highly challenging: extracting behavioral cues
such as target locations, speaking activity, and head/body pose is difficult
owing to crowdedness and the presence of extreme occlusions. To
this end, we propose SALSA, a novel dataset facilitating multimodal and
Synergetic sociAL Scene Analysis, and make two main contributions to research
on automated social interaction analysis: (1) SALSA records social interactions
among 18 participants in a natural, indoor environment for over 60 minutes,
in both a poster presentation and a cocktail party context, presenting
difficulties in the form of low-resolution images, lighting variations,
numerous occlusions, reverberations, and interfering sound sources; (2) To
alleviate these problems we facilitate multimodal analysis by recording the
social interplay using four static surveillance cameras and sociometric badges
worn by each participant, comprising microphone, accelerometer, Bluetooth,
and infrared sensors. In addition to raw data, we also provide annotations
concerning individuals' personality as well as their position, head, body
orientation and F-formation information over the entire event duration. Through
extensive experiments with state-of-the-art approaches, we show (a) the
limitations of current methods and (b) how the recorded multiple cues
synergetically aid automatic analysis of social interactions. SALSA is
available at http://tev.fbk.eu/salsa.
Comment: 14 pages, 11 figures