Design Of Computer Vision Systems For Optimizing The Threat Detection Accuracy
This dissertation considers computer vision (CV) systems in which a central monitoring station receives and analyzes the video streams captured and delivered wirelessly by multiple cameras. It addresses how the bandwidth can be allocated to various cameras by presenting a cross-layer solution that optimizes the overall detection or recognition accuracy. The dissertation presents and develops a real CV system and subsequently provides a detailed experimental analysis of cross-layer optimization. Other unique features of the developed solution include employing the popular HTTP streaming approach, utilizing homogeneous cameras as well as heterogeneous ones with varying capabilities and limitations, and including a new algorithm for estimating the effective medium airtime. The results show that the proposed solution significantly improves the CV accuracy.
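The allocation idea can be illustrated with a minimal sketch. It is not the dissertation's algorithm: it assumes each camera has a known, saturating accuracy-versus-bitrate curve and hands out bandwidth greedily in fixed steps. The curve shape, the parameter values, and the names `allocate_bandwidth` and `make_curve` are all hypothetical.

```python
import math

def allocate_bandwidth(total_kbps, step_kbps, accuracy_fns):
    """Greedily give each bandwidth step to the camera with the largest accuracy gain."""
    rates = [0] * len(accuracy_fns)
    remaining = total_kbps
    while remaining >= step_kbps:
        # Marginal accuracy gain of one more step for every camera.
        gains = [f(r + step_kbps) - f(r) for f, r in zip(accuracy_fns, rates)]
        best = max(range(len(gains)), key=gains.__getitem__)
        if gains[best] <= 0:
            break  # no camera benefits from more bandwidth
        rates[best] += step_kbps
        remaining -= step_kbps
    return rates

def make_curve(max_accuracy, scale_kbps):
    """Hypothetical accuracy model: rises with bitrate and saturates at max_accuracy."""
    return lambda rate: max_accuracy * (1 - math.exp(-rate / scale_kbps))

# Three hypothetical cameras with different achievable accuracy and bitrate needs.
cameras = [make_curve(0.9, 500), make_curve(0.6, 500), make_curve(0.9, 2000)]
rates = allocate_bandwidth(3000, 100, cameras)
print(rates, sum(f(r) for f, r in zip(cameras, rates)))
```

For concave curves such as these, the greedy step rule maximizes the summed accuracy at the chosen step granularity. A real system would have to measure the curves and account for the effective medium airtime, which this sketch ignores.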
Additionally, the dissertation features an improved neural network system for object detection. The proposed system considers inherent video characteristics and employs different motion detection and clustering algorithms to focus on the areas of importance in consecutive frames, allowing the system to dynamically and efficiently distribute the detection task among multiple deployments of object detection neural networks. Our experimental results indicate that the proposed method can improve the mean average precision (mAP) while reducing both the execution time and the amount of data transmitted to the object detection networks.
Finally, as recognizing an activity provides significant automation prospects in CV systems, the dissertation presents an efficient activity-detection recurrent neural network that utilizes fast pose/limb estimation approaches. By combining object detection with pose estimation, the domain of activity detection is shifted from a volume of RGB (Red, Green, and Blue) pixel values to a time series of relatively small one-dimensional arrays, thereby allowing the activity detection system to take advantage of highly capable neural networks that have been trained on large GPU clusters for thousands of hours. Consequently, capable activity detection systems can be built with considerably less training data and far fewer processing hours.
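To give a sense of scale for this shift in input domain, the short calculation below compares the number of values per clip in the two representations. The clip length, frame resolution, and keypoint count are illustrative assumptions, not figures from the dissertation.

```python
# Hypothetical clip: 32 frames of 224x224 RGB video.
frames, height, width, channels = 32, 224, 224, 3
rgb_values = frames * height * width * channels

# Hypothetical pose encoding: 17 keypoints with (x, y) coordinates per frame,
# flattened into one small one-dimensional array per frame.
keypoints, coords = 17, 2
pose_values = frames * keypoints * coords

reduction = rgb_values / pose_values
print(rgb_values, pose_values, round(reduction))
```

Under these assumptions the recurrent network sees 1,088 values per clip instead of about 4.8 million, a reduction by a factor of roughly 4,400.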
Towards Optimal PTZ Camera Scheduling
The automatic control of Pan/Tilt/Zoom (PTZ) cameras has been a major research problem. We consider the control of PTZ cameras in a manner that optimizes the overall recognition accuracy. The camera control solution operates in two alternating phases: pre-recording and recording. In the first phase, the processing architecture performs the necessary algorithmic calculations to determine the optimal PTZ camera settings. In the second phase, the PTZ cameras apply the desired settings, capture the videos, and stream these videos to the proxy station for analysis.
We enhance the overall recognition accuracy by developing a parallel PTZ control algorithm, which reduces the time spent on pre-recording and thus increases the fraction of time dedicated to capturing the actual videos of the surveillance site. Additionally, we propose a dynamic approach for determining the pre-recording time, allowing the system to take full advantage of the parallel algorithm. As the parallel algorithm leads to early completion of the pre-recording tasks, the dynamic approach enables the system to reclaim the time left unused in the pre-recording phase and devote it to the actual recording.
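The effect of the dynamic approach on the recording share can be sketched with simple arithmetic. The function `recording_fraction` and all timing values below are hypothetical and serve only to illustrate the idea; they are not taken from the work itself.

```python
def recording_fraction(cycle_s, pre_budget_s, pre_actual_s, dynamic):
    """Share of one pre-recording/recording cycle that is spent recording.

    With a fixed budget, recording starts only when the budget expires, so
    finishing the pre-recording tasks early leaves the remainder idle.
    With the dynamic approach, recording starts as soon as they complete.
    """
    if pre_actual_s > pre_budget_s:
        raise ValueError("pre-recording tasks exceed the allotted budget")
    pre_used = pre_actual_s if dynamic else pre_budget_s
    return (cycle_s - pre_used) / cycle_s

# Hypothetical 10 s cycle, 2 s budget, parallel algorithm finishing in 0.5 s.
fixed = recording_fraction(10.0, 2.0, 0.5, dynamic=False)
adaptive = recording_fraction(10.0, 2.0, 0.5, dynamic=True)
print(fixed, adaptive)
```

With these assumed numbers, the recording share rises from 0.8 to 0.95 of each cycle once the unused pre-recording time is reclaimed.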
We analyze the effectiveness of the proposed solutions through extensive simulation, considering the impacts of major parameters, including the subject arrival rate, the surveillance area, and the number of cameras. To make the simulations as realistic as possible, we incorporate an inclusive speed model that continuously updates the speeds of the subjects as they move through the surveillance site. This speed model considers many factors, including the social tendencies and density of the people present in the surveillance site. Our overall solution assumes realistic 3D environments and not just 2D scenes.
We demonstrate that the proposed parallel algorithm substantially reduces the pre-recording time. We also show that the combination of the proposed parallel algorithm and dynamic approach greatly enhances the overall face recognition accuracy.