16,179 research outputs found
A multi-viewpoint feature-based re-identification system driven by skeleton keypoints
Thanks to the increasing popularity of 3D sensors, robotic vision has experienced huge improvements in a wide range of applications and systems in the last years. Besides the many benefits, this migration caused some incompatibilities with those systems that cannot be based on range sensors, like intelligent video surveillance systems, since the two kinds of sensor data lead to different representations of people and objects. This work goes in the direction of bridging the gap, and presents a novel re-identification system that takes advantage of multiple video flows in order to enhance the performance of a skeletal tracking algorithm, which is in turn exploited for driving the re-identification. A new, geometry-based method for joining together the detections provided by the skeletal tracker from multiple video flows is introduced, which is capable of dealing with many people in the scene, coping with the errors introduced in each view by the skeletal tracker. Such method has a high degree of generality, and can be applied to any kind of body pose estimation algorithm. The system was tested on a public dataset for video surveillance applications, demonstrating the improvements achieved by the multi-viewpoint approach in the accuracy of both body pose estimation and re-identification. The proposed approach was also compared with a skeletal tracking system working on 3D data: the comparison assessed the good performance level of the multi-viewpoint approach. This means that the lack of the rich information provided by 3D sensors can be compensated by the availability of more than one viewpoint
A framework for evaluating stereo-based pedestrian detection techniques
Automated pedestrian detection, counting, and tracking have received significant attention in the computer vision community of late. As such, a variety of techniques have been investigated using both traditional 2-D computer vision techniques and, more recently, 3-D stereo information. However, to date, a quantitative assessment of the performance of stereo-based pedestrian detection has been problematic, mainly due to the lack of standard stereo-based test data and an agreed methodology for carrying out the evaluation. This has forced researchers into making subjective comparisons between competing approaches. In this paper, we propose a framework for the quantitative evaluation of a short-baseline stereo-based pedestrian detection system. We provide freely available synthetic and real-world test data and recommend a set of evaluation metrics. This allows researchers to benchmark systems, not only with respect to other stereo-based approaches, but also with more traditional 2-D approaches. In order to illustrate its usefulness, we demonstrate the application of this framework to evaluate our own recently proposed technique for pedestrian detection and tracking
A distributed camera system for multi-resolution surveillance
We describe an architecture for a multi-camera, multi-resolution surveillance system. The aim is to support a set of distributed static and pan-tilt-zoom (PTZ) cameras and visual tracking algorithms, together with a central supervisor unit. Each camera (and possibly pan-tilt device) has a dedicated process and processor.
Asynchronous interprocess communications and archiving of data are achieved in a simple and effective way via a central repository, implemented using an SQL database.
Visual tracking data from static views are stored dynamically into tables in the database via client calls to the SQL server. A supervisor process running on the SQL server determines if active zoom cameras should be dispatched to observe a particular target, and this message is effected via writing demands into another database table.
We show results from a real implementation of the system comprising one static camera overviewing the environment under consideration and a PTZ camera operating
under closed-loop velocity control, which uses a fast and robust level-set-based region tracker. Experiments demonstrate the effectiveness of our approach and its feasibility to multi-camera systems for intelligent surveillance
Parameter-unaware autocalibration for occupancy mapping
People localization and occupancy mapping are common and important tasks for multi-camera systems. In this paper, we present a novel approach to overcome the hurdle of manual extrinsic calibration of the multi-camera system. Our approach is completely parameter unaware, meaning that the user does not need to know the focal length, position or viewing angle in advance, nor will these values be calibrated as such. The only requirement to the multi-camera setup is that the views overlap substantially and are mounted at approximately the same height, requirements that are satisfied in most typical multi-camera configurations. The proposed method uses the observed height of an object or person moving through the space to estimate the distance to the object or person. Using this distance to backproject the lowest point of each detected object, we obtain a rotated and anisotropically scaled view of the ground plane for each camera. An algorithm is presented to estimate the anisotropic scaling parameters and rotation for each camera, after which ground plane positions can be computed up to an isotropic scale factor. Lens distortion is not taken into account. The method is tested in simulation yielding average accuracies within 5cm, and in a real multi-camera environment with an accuracy within 15cm
Sensor node localisation using a stereo camera rig
In this paper, we use stereo vision processing techniques to
detect and localise sensors used for monitoring simulated
environmental events within an experimental sensor network testbed. Our sensor nodes communicate to the camera through patterns emitted by light emitting diodes (LEDs). Ultimately, we envisage the use of very low-cost, low-power,
compact microcontroller-based sensing nodes that employ
LED communication rather than power hungry RF to transmit data that is gathered via existing CCTV infrastructure.
To facilitate our research, we have constructed a controlled
environment where nodes and cameras can be deployed and
potentially hazardous chemical or physical plumes can be
introduced to simulate environmental pollution events in a
controlled manner. In this paper we show how 3D spatial
localisation of sensors becomes a straightforward task when
a stereo camera rig is used rather than a more usual 2D
CCTV camera
Eye in the Sky: Real-time Drone Surveillance System (DSS) for Violent Individuals Identification using ScatterNet Hybrid Deep Learning Network
Drone systems have been deployed by various law enforcement agencies to
monitor hostiles, spy on foreign drug cartels, conduct border control
operations, etc. This paper introduces a real-time drone surveillance system to
identify violent individuals in public areas. The system first uses the Feature
Pyramid Network to detect humans from aerial images. The image region with the
human is used by the proposed ScatterNet Hybrid Deep Learning (SHDL) network
for human pose estimation. The orientations between the limbs of the estimated
pose are next used to identify the violent individuals. The proposed deep
network can learn meaningful representations quickly using ScatterNet and
structural priors with relatively fewer labeled examples. The system detects
the violent individuals in real-time by processing the drone images in the
cloud. This research also introduces the aerial violent individual dataset used
for training the deep network which hopefully may encourage researchers
interested in using deep learning for aerial surveillance. The pose estimation
and violent individuals identification performance is compared with the
state-of-the-art techniques.Comment: To Appear in the Efficient Deep Learning for Computer Vision (ECV)
workshop at IEEE Computer Vision and Pattern Recognition (CVPR) 2018. Youtube
demo at this: https://www.youtube.com/watch?v=zYypJPJipY
- …