4,198 research outputs found
Real-time marker-less multi-person 3D pose estimation in RGB-Depth camera networks
This paper proposes a novel system to estimate and track the 3D poses of
multiple persons in calibrated RGB-Depth camera networks. The multi-view 3D
pose of each person is computed by a central node which receives the
single-view outcomes from each camera of the network. Each single-view outcome
is computed by using a CNN for 2D pose estimation and extending the resulting
skeletons to 3D by means of the sensor depth. The proposed system is
marker-less, multi-person, independent of background and does not make any
assumption on people appearance and initial pose. The system provides real-time
outcomes, thus being perfectly suited for applications requiring user
interaction. Experimental results show the effectiveness of this work with
respect to a baseline multi-view approach in different scenarios. To foster
research and applications based on this work, we released the source code in
OpenPTrack, an open source project for RGB-D people tracking.Comment: Submitted to the 2018 IEEE International Conference on Robotics and
Automatio
Lidar-based Gait Analysis and Activity Recognition in a 4D Surveillance System
This paper presents new approaches for gait and activity analysis based on data streams of a Rotating Multi Beam (RMB) Lidar sensor. The proposed algorithms are embedded into an integrated 4D vision and visualization system, which is able to analyze and interactively display real scenarios in natural outdoor environments with walking pedestrians. The main focus of the investigations are gait based person re-identification during tracking, and recognition of specific activity patterns such as bending, waving, making phone calls and checking the time looking at wristwatches.
The descriptors for training and recognition are observed and extracted from realistic outdoor surveillance scenarios, where multiple pedestrians are walking in the field of interest following possibly intersecting trajectories, thus the observations might often be affected by occlusions or background noise.
Since there is no public database available for such scenarios, we created and published a new Lidar-based outdoors gait and activity dataset on our website, that contains point cloud sequences of 28 different persons extracted and aggregated from 35 minutes-long measurements.
The presented results confirm that both efficient gait-based identification and activity recognition is achievable in the sparse point clouds of a single RMB Lidar sensor. After extracting the people trajectories, we synthesized a free-viewpoint video, where moving avatar models follow the trajectories of the observed pedestrians in real time, ensuring that the leg movements of the animated avatars are synchronized with the real gait cycles observed in the Lidar stream
Ensemble of Different Approaches for a Reliable Person Re-identification System
An ensemble of approaches for reliable person re-identification is proposed in this paper. The proposed ensemble is built combining widely used person re-identification systems using different color spaces and some variants of state-of-the-art approaches that are proposed in this paper. Different descriptors are tested, and both texture and color features are extracted from the images; then the different descriptors are compared using different distance measures (e.g., the Euclidean distance, angle, and the Jeffrey distance). To improve performance, a method based on skeleton detection, extracted from the depth map, is also applied when the depth map is available. The proposed ensemble is validated on three widely used datasets (CAVIAR4REID, IAS, and VIPeR), keeping the same parameter set of each approach constant across all tests to avoid overfitting and to demonstrate that the proposed system can be considered a general-purpose person re-identification system. Our experimental results show that the proposed system offers significant improvements over baseline approaches. The source code used for the approaches tested in this paper will be available at https://www.dei.unipd.it/node/2357 and http://robotics.dei.unipd.it/reid/
Recurrent Attention Models for Depth-Based Person Identification
We present an attention-based model that reasons on human body shape and
motion dynamics to identify individuals in the absence of RGB information,
hence in the dark. Our approach leverages unique 4D spatio-temporal signatures
to address the identification problem across days. Formulated as a
reinforcement learning task, our model is based on a combination of
convolutional and recurrent neural networks with the goal of identifying small,
discriminative regions indicative of human identity. We demonstrate that our
model produces state-of-the-art results on several published datasets given
only depth images. We further study the robustness of our model towards
viewpoint, appearance, and volumetric changes. Finally, we share insights
gleaned from interpretable 2D, 3D, and 4D visualizations of our model's
spatio-temporal attention.Comment: Computer Vision and Pattern Recognition (CVPR) 201
Self-Supervised Gait Encoding with Locality-Aware Attention for Person Re-Identification
Gait-based person re-identification (Re-ID) is valuable for safety-critical
applications, and using only 3D skeleton data to extract discriminative gait
features for person Re-ID is an emerging open topic. Existing methods either
adopt hand-crafted features or learn gait features by traditional supervised
learning paradigms. Unlike previous methods, we for the first time propose a
generic gait encoding approach that can utilize unlabeled skeleton data to
learn gait representations in a self-supervised manner. Specifically, we first
propose to introduce self-supervision by learning to reconstruct input skeleton
sequences in reverse order, which facilitates learning richer high-level
semantics and better gait representations. Second, inspired by the fact that
motion's continuity endows temporally adjacent skeletons with higher
correlations ("locality"), we propose a locality-aware attention mechanism that
encourages learning larger attention weights for temporally adjacent skeletons
when reconstructing current skeleton, so as to learn locality when encoding
gait. Finally, we propose Attention-based Gait Encodings (AGEs), which are
built using context vectors learned by locality-aware attention, as final gait
representations. AGEs are directly utilized to realize effective person Re-ID.
Our approach typically improves existing skeleton-based methods by 10-20%
Rank-1 accuracy, and it achieves comparable or even superior performance to
multi-modal methods with extra RGB or depth information. Our codes are
available at https://github.com/Kali-Hac/SGE-LA.Comment: Accepted at IJCAI 2020 Main Track. Sole copyright holder is IJCAI.
Codes are available at https://github.com/Kali-Hac/SGE-L
AFFECT-PRESERVING VISUAL PRIVACY PROTECTION
The prevalence of wireless networks and the convenience of mobile cameras enable many new video applications other than security and entertainment. From behavioral diagnosis to wellness monitoring, cameras are increasing used for observations in various educational and medical settings. Videos collected for such applications are considered protected health information under privacy laws in many countries. Visual privacy protection techniques, such as blurring or object removal, can be used to mitigate privacy concern, but they also obliterate important visual cues of affect and social behaviors that are crucial for the target applications. In this dissertation, we propose to balance the privacy protection and the utility of the data by preserving the privacy-insensitive information, such as pose and expression, which is useful in many applications involving visual understanding.
The Intellectual Merits of the dissertation include a novel framework for visual privacy protection by manipulating facial image and body shape of individuals, which: (1) is able to conceal the identity of individuals; (2) provide a way to preserve the utility of the data, such as expression and pose information; (3) balance the utility of the data and capacity of the privacy protection.
The Broader Impacts of the dissertation focus on the significance of privacy protection on visual data, and the inadequacy of current privacy enhancing technologies in preserving affect and behavioral attributes of the visual content, which are highly useful for behavior observation in educational and medical settings. This work in this dissertation represents one of the first attempts in achieving both goals simultaneously
Multi-frame techniques for long-term people re-identification with consumer depth cameras
In this work we address the people re-identification problem with particular attention to long-term re-identification in which a subject has to be re-identified even after some days from the last occurrence. In particular, we focus on multi-frame techniques, which exploit information from multiple frames for producing the re-identification output. We introduced three real-time algorithms which improve classification performance obtained by state-of-the-art algorithms on two public datasets
- …