8,148 research outputs found
Recommended from our members
Multitarget Tracking in Nonoverlapping Cameras Using a Reference Set
Tracking multiple targets in nonoverlapping cameras are challenging since the observations of the same targets are often separated by time and space. There might be significant appearance change of a target across camera views caused by variations in illumination conditions, poses, and camera imaging characteristics. Consequently, the same target may appear very different in two cameras. Therefore, associating tracks in different camera views directly based on their appearance similarity is difficult and prone to error. In most previous methods, the appearance similarity is computed either using color histograms or based on pretrained brightness transfer function that maps color between cameras. In this paper, a novel reference set based appearance model is proposed to improve multitarget tracking in a network of nonoverlapping cameras. Contrary to previous work, a reference set is constructed for a pair of cameras, containing subjects appearing in both camera views. For track association, instead of directly comparing the appearance of two targets in different camera views, they are compared indirectly via the reference set. Besides global color histograms, texture and shape features are extracted at different locations of a target, and AdaBoost is used to learn the discriminative power of each feature. The effectiveness of the proposed method over the state of the art on two challenging real-world multicamera video data sets is demonstrated by thorough experiments
Articulated Clinician Detection Using 3D Pictorial Structures on RGB-D Data
Reliable human pose estimation (HPE) is essential to many clinical
applications, such as surgical workflow analysis, radiation safety monitoring
and human-robot cooperation. Proposed methods for the operating room (OR) rely
either on foreground estimation using a multi-camera system, which is a
challenge in real ORs due to color similarities and frequent illumination
changes, or on wearable sensors or markers, which are invasive and therefore
difficult to introduce in the room. Instead, we propose a novel approach based
on Pictorial Structures (PS) and on RGB-D data, which can be easily deployed in
real ORs. We extend the PS framework in two ways. First, we build robust and
discriminative part detectors using both color and depth images. We also
present a novel descriptor for depth images, called histogram of depth
differences (HDD). Second, we extend PS to 3D by proposing 3D pairwise
constraints and a new method that makes exact inference tractable. Our approach
is evaluated for pose estimation and clinician detection on a challenging RGB-D
dataset recorded in a busy operating room during live surgeries. We conduct
series of experiments to study the different part detectors in conjunction with
the various 2D or 3D pairwise constraints. Our comparisons demonstrate that 3D
PS with RGB-D part detectors significantly improves the results in a visually
challenging operating environment.Comment: The supplementary video is available at https://youtu.be/iabbGSqRSg
Harvesting Multiple Views for Marker-less 3D Human Pose Annotations
Recent advances with Convolutional Networks (ConvNets) have shifted the
bottleneck for many computer vision tasks to annotated data collection. In this
paper, we present a geometry-driven approach to automatically collect
annotations for human pose prediction tasks. Starting from a generic ConvNet
for 2D human pose, and assuming a multi-view setup, we describe an automatic
way to collect accurate 3D human pose annotations. We capitalize on constraints
offered by the 3D geometry of the camera setup and the 3D structure of the
human body to probabilistically combine per view 2D ConvNet predictions into a
globally optimal 3D pose. This 3D pose is used as the basis for harvesting
annotations. The benefit of the annotations produced automatically with our
approach is demonstrated in two challenging settings: (i) fine-tuning a generic
ConvNet-based 2D pose predictor to capture the discriminative aspects of a
subject's appearance (i.e.,"personalization"), and (ii) training a ConvNet from
scratch for single view 3D human pose prediction without leveraging 3D pose
groundtruth. The proposed multi-view pose estimator achieves state-of-the-art
results on standard benchmarks, demonstrating the effectiveness of our method
in exploiting the available multi-view information.Comment: CVPR 2017 Camera Read
Review of Person Re-identification Techniques
Person re-identification across different surveillance cameras with disjoint
fields of view has become one of the most interesting and challenging subjects
in the area of intelligent video surveillance. Although several methods have
been developed and proposed, certain limitations and unresolved issues remain.
In all of the existing re-identification approaches, feature vectors are
extracted from segmented still images or video frames. Different similarity or
dissimilarity measures have been applied to these vectors. Some methods have
used simple constant metrics, whereas others have utilised models to obtain
optimised metrics. Some have created models based on local colour or texture
information, and others have built models based on the gait of people. In
general, the main objective of all these approaches is to achieve a
higher-accuracy rate and lowercomputational costs. This study summarises
several developments in recent literature and discusses the various available
methods used in person re-identification. Specifically, their advantages and
disadvantages are mentioned and compared.Comment: Published 201
- âŠ