Real-time marker-less multi-person 3D pose estimation in RGB-Depth camera networks
This paper proposes a novel system to estimate and track the 3D poses of
multiple persons in calibrated RGB-Depth camera networks. The multi-view 3D
pose of each person is computed by a central node which receives the
single-view outcomes from each camera of the network. Each single-view outcome
is computed by using a CNN for 2D pose estimation and extending the resulting
skeletons to 3D by means of the sensor depth. The proposed system is
marker-less, multi-person, independent of background and does not make any
assumption on people appearance and initial pose. The system provides real-time
outcomes, thus being perfectly suited for applications requiring user
interaction. Experimental results show the effectiveness of this work with
respect to a baseline multi-view approach in different scenarios. To foster
research and applications based on this work, we released the source code in
OpenPTrack, an open source project for RGB-D people tracking.
Comment: Submitted to the 2018 IEEE International Conference on Robotics and
Automatio
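The depth-based lifting step described in the abstract above can be illustrated with a standard pinhole back-projection. This is only a minimal sketch, not the OpenPTrack implementation: the function name and the intrinsic parameters (fx, fy, cx, cy) are assumptions introduced for illustration, and the depth is assumed to be metric and already registered to the color image.

```python
from typing import Optional, Tuple


def backproject_keypoint(
    u: float, v: float, depth_m: float,
    fx: float, fy: float, cx: float, cy: float,
) -> Optional[Tuple[float, float, float]]:
    """Lift pixel (u, v) with depth in meters to a 3D point in the camera frame.

    Uses the pinhole model; fx, fy, cx, cy are hypothetical camera intrinsics.
    Returns None when the sensor reports no valid depth (zero or negative).
    """
    if depth_m <= 0.0:
        return None
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return (x, y, depth_m)
```

Applying this to every joint of a 2D skeleton yields a single-view 3D skeleton; joints with missing depth come back as None and would need to be handled by whatever fusion step follows.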
A multi-viewpoint feature-based re-identification system driven by skeleton keypoints
Thanks to the increasing popularity of 3D sensors, robotic vision has seen major improvements across a wide range of applications and systems in recent years. Besides the many benefits, this migration caused some incompatibilities with systems that cannot be based on range sensors, such as intelligent video surveillance systems, since the two kinds of sensor data lead to different representations of people and objects. This work aims to bridge that gap and presents a novel re-identification system that uses multiple video flows to enhance the performance of a skeletal tracking algorithm, which in turn drives the re-identification. A new, geometry-based method is introduced for joining the detections provided by the skeletal tracker from multiple video flows; it can deal with many people in the scene and copes with the errors introduced in each view by the skeletal tracker. This method has a high degree of generality and can be applied to any kind of body pose estimation algorithm. The system was tested on a public dataset for video surveillance applications, demonstrating the improvements achieved by the multi-viewpoint approach in the accuracy of both body pose estimation and re-identification. The proposed approach was also compared with a skeletal tracking system working on 3D data: the comparison confirmed the good performance of the multi-viewpoint approach. This means that the lack of the rich information provided by 3D sensors can be compensated by the availability of more than one viewpoint.
A distributed camera system for multi-resolution surveillance
We describe an architecture for a multi-camera, multi-resolution surveillance system. The aim is to support a set of distributed static and pan-tilt-zoom (PTZ) cameras and visual tracking algorithms, together with a central supervisor unit. Each camera (and possibly pan-tilt device) has a dedicated process and processor.
Asynchronous interprocess communications and archiving of data are achieved in a simple and effective way via a central repository, implemented using an SQL database.
Visual tracking data from static views are stored dynamically into tables in the database via client calls to the SQL server. A supervisor process running on the SQL server determines if active zoom cameras should be dispatched to observe a particular target, and this message is effected via writing demands into another database table.
We show results from a real implementation of the system comprising one static camera overviewing the environment under consideration and a PTZ camera operating under closed-loop velocity control, which uses a fast and robust level-set-based region tracker. Experiments demonstrate the effectiveness of our approach and its feasibility for multi-camera systems for intelligent surveillance.
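The database-mediated communication described in the abstract above can be sketched as follows. This is an illustrative sketch only: it uses an in-memory SQLite database in place of an SQL server, and the table names, column names, and the dispatch rule (one demand per target, based on its latest observation) are all assumptions, not the authors' schema or policy.

```python
import sqlite3


def create_repository() -> sqlite3.Connection:
    """Create the central repository with two hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE observations (id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "camera_id INTEGER, target_id INTEGER, x REAL, y REAL)"
    )
    conn.execute(
        "CREATE TABLE demands (id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "ptz_id INTEGER, target_id INTEGER, x REAL, y REAL)"
    )
    return conn


def report_observation(conn, camera_id, target_id, x, y):
    """Called by a static-camera tracking process to store one track point."""
    conn.execute(
        "INSERT INTO observations (camera_id, target_id, x, y) VALUES (?, ?, ?, ?)",
        (camera_id, target_id, x, y),
    )
    conn.commit()


def supervise(conn, ptz_id):
    """Supervisor step: write a demand for each target that has none yet.

    Uses the most recent observation of each target as the requested position.
    Returns the number of demands written.
    """
    rows = conn.execute(
        "SELECT o.target_id, o.x, o.y FROM observations o "
        "WHERE o.id = (SELECT MAX(id) FROM observations WHERE target_id = o.target_id) "
        "AND o.target_id NOT IN (SELECT target_id FROM demands)"
    ).fetchall()
    for target_id, x, y in rows:
        conn.execute(
            "INSERT INTO demands (ptz_id, target_id, x, y) VALUES (?, ?, ?, ?)",
            (ptz_id, target_id, x, y),
        )
    conn.commit()
    return len(rows)
```

Because the tracking processes and the supervisor only ever read and write tables, they never need to talk to each other directly, and the observation table doubles as an archive of the tracking data.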
A Novel Method for Extrinsic Calibration of Multiple RGB-D Cameras Using Descriptor-Based Patterns
This letter presents a novel method to estimate the relative poses between
RGB-D cameras with minimal overlapping fields of view in a panoramic RGB-D
camera system. This calibration problem is relevant to applications such as
indoor 3D mapping and robot navigation that can benefit from a 360-degree
field of view using RGB-D cameras. The proposed approach relies on
descriptor-based patterns to provide well-matched 2D keypoints in the case of a
minimal overlapping field of view between cameras. Integrating the matched 2D
keypoints with corresponding depth values, a set of 3D matched keypoints is
constructed to calibrate multiple RGB-D cameras. Experiments validated the
accuracy and efficiency of the proposed calibration approach, both superior to
those of existing methods (800 ms vs. 5 seconds; rotation error of 0.56 degrees
vs. 1.6 degrees; and translation error of 1.80 cm vs. 2.5 cm).
Comment: 6 pages, 7 figures, under review by IEEE Robotics and Automation
Letters & ICR
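To illustrate how matched 3D keypoints can determine the relative pose between two cameras, here is a minimal closed-form sketch that recovers a rotation and translation from exactly three noise-free matches by attaching an orthonormal frame to each triplet. The abstract above does not state which solver the authors use, so this is a swapped-in technique and all function names are hypothetical; a practical calibration would more likely run a least-squares fit over many noisy matches.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _unit(a):
    n = math.sqrt(sum(c * c for c in a))
    if n == 0.0:
        raise ValueError("degenerate (coincident or collinear) points")
    return tuple(c / n for c in a)


def _frame(p0, p1, p2):
    """Orthonormal axes (x, y, z) attached to three non-collinear points."""
    x = _unit(_sub(p1, p0))
    z = _unit(_cross(x, _sub(p2, p0)))
    y = _cross(z, x)
    return (x, y, z)


def rigid_from_three_points(src, dst):
    """Return (R, t) such that dst[i] = R @ src[i] + t for three exact matches."""
    fa = _frame(*src)
    fb = _frame(*dst)
    # R = sum_k fb[k] fa[k]^T maps each source axis onto the matching target axis.
    R = [[sum(fb[k][i] * fa[k][j] for k in range(3)) for j in range(3)]
         for i in range(3)]
    rp0 = [sum(R[i][j] * src[0][j] for j in range(3)) for i in range(3)]
    t = [dst[0][i] - rp0[i] for i in range(3)]
    return R, t
```

For example, with source points (0,0,0), (1,0,0), (0,1,0) and their images under a 90-degree rotation about the z axis followed by a shift of (1,2,3), the function returns that rotation matrix and that translation.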
CaLib: Simple and Accurate LiDAR-RGB Calibration using Small Common Markers
In many fields of robotics, knowing the relative position and orientation
between two sensors is a mandatory precondition to operate with multiple
sensing modalities. In this context, LiDAR-RGB camera pairs offer
complementary features: LiDARs yield sparse high quality range measurements,
while RGB cameras provide a dense color measurement of the environment.
Existing techniques often rely either on complex calibration targets that are
expensive to obtain, or extracted virtual correspondences that can hinder the
estimate's accuracy. In this paper we address the problem of LiDAR-RGB
calibration using typical calibration patterns (i.e. A3 chessboard) with
minimal human intervention. Our approach exploits the planarity of the target
to find correspondences between the sensors' measurements, leading to features
that are robust to LiDAR noise.
Moreover, we estimate a solution by solving a joint non-linear optimization
problem. We validated our approach by carrying out quantitative and comparative
experiments with other state-of-the-art approaches. Our results show that our
simple scheme performs on par with or better than other approaches using complex
calibration targets. Finally, we release an open-source C++ implementation at
https://github.com/srrg-sapienza/ca2lib
Comment: 7 pages, 10 figure
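The planarity constraint mentioned in the abstract above can be illustrated with a small geometric sketch: build a plane from three points on the target and measure how far another point lies from it. This is not the ca2lib implementation or its joint optimization; the function names are hypothetical, and a real pipeline would presumably fit the plane to many noisy LiDAR returns rather than to three exact points.

```python
import math


def plane_from_points(p0, p1, p2):
    """Return (n, d) with unit normal n and offset d such that n . x + d = 0."""
    u = (p1[0] - p0[0], p1[1] - p0[1], p1[2] - p0[2])
    v = (p2[0] - p0[0], p2[1] - p0[1], p2[2] - p0[2])
    n = (u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0])
    norm = math.sqrt(n[0] ** 2 + n[1] ** 2 + n[2] ** 2)
    if norm == 0.0:
        raise ValueError("points are collinear; no unique plane")
    n = (n[0] / norm, n[1] / norm, n[2] / norm)
    d = -(n[0] * p0[0] + n[1] * p0[1] + n[2] * p0[2])
    return n, d


def point_to_plane_distance(point, plane):
    """Signed distance from a 3D point to the plane (positive along the normal)."""
    n, d = plane
    return n[0] * point[0] + n[1] * point[1] + n[2] * point[2] + d
```

Residuals of this kind, evaluated between LiDAR points and the target plane estimated from the image, are one plausible ingredient of a calibration cost, since they do not require matching individual points across the two sensors.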
Hybrid Focal Stereo Networks for Pattern Analysis in Homogeneous Scenes
In this paper we address the problem of multiple camera calibration in the
presence of a homogeneous scene, and without the possibility of employing
calibration-object-based methods. The proposed solution exploits salient
features present in a larger field of view, but instead of employing active
vision we replace the cameras with stereo rigs featuring a long focal analysis
camera, as well as a short focal registration camera. Thus, we are able to
propose an accurate solution which does not require intrinsic variation models
as in the case of zooming cameras. Moreover, the availability of the two views
simultaneously in each rig allows for pose re-estimation between rigs as often
as necessary. The algorithm has been successfully validated in an indoor
setting, as well as on a difficult scene featuring a highly dense pilgrim crowd
in Makkah.
Comment: 13 pages, 6 figures, submitted to Machine Vision and Application