
    A Low Cost and Computationally Efficient Approach for Occlusion Handling in Video Surveillance Systems

    In the development of intelligent video surveillance systems for vehicle tracking, occlusion is one of the major challenges: it becomes difficult to retain features during occlusion, especially complete occlusion. In this paper, a target vehicle tracking algorithm for Smart Video Surveillance (SVS) is proposed to track an unidentified target vehicle even under occlusion. The paper proposes a computationally efficient approach for handling occlusions, named the Kalman Filter Assisted Occlusion Handling (KFAOH) technique. The algorithm works through two periods, namely a tracking period when no occlusion is seen and a detection period when occlusion occurs, reflecting its hybrid nature. A Kanade-Lucas-Tomasi (KLT) feature tracker governs the operation of the algorithm during the tracking period, whereas a Cascaded Object Detector (COD) of weak classifiers, specially trained on a large database of cars, governs operation during the detection period with the assistance of a Kalman Filter (KF). The algorithm's tracking efficiency has been tested on six different tracking scenarios of increasing complexity in real time. Performance evaluation under different noise variances and illumination levels shows that the tracking algorithm is robust against high noise and low illumination. All tests were conducted on the MATLAB platform. The validity and practicality of the algorithm are also verified by success and precision plots for the test cases.
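    To make the hybrid tracking/detection scheme concrete, below is a minimal Python sketch of Kalman-filter-assisted occlusion handling under a constant-velocity motion model. The KLT and detector interfaces (`klt_position`, `detector.detect_near`) are hypothetical stand-ins, not the paper's actual implementation.

```python
import numpy as np

class ConstantVelocityKF:
    """2D constant-velocity Kalman filter: state = [x, y, vx, vy]."""
    def __init__(self, dt=1.0, process_var=1.0, meas_var=10.0):
        self.x = np.zeros(4)                      # state estimate
        self.P = np.eye(4) * 500.0                # state covariance
        self.F = np.array([[1, 0, dt, 0],
                           [0, 1, 0, dt],
                           [0, 0, 1, 0],
                           [0, 0, 0, 1]], float)  # state transition
        self.H = np.array([[1, 0, 0, 0],
                           [0, 1, 0, 0]], float)  # we measure position only
        self.Q = np.eye(4) * process_var
        self.R = np.eye(2) * meas_var

    def predict(self):
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:2]                         # predicted position

    def update(self, z):
        y = np.asarray(z, float) - self.H @ self.x
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.x = self.x + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P

def track_frame(kf, klt_position, detector):
    """Tracking period: correct the KF with the KLT measurement.
    Detection period (occlusion): coast on the KF prediction and search
    for the re-appearing vehicle near it. `detector` is hypothetical."""
    predicted = kf.predict()
    if klt_position is not None:                  # features still visible
        kf.update(klt_position)
        return kf.x[:2]
    detection = detector.detect_near(predicted)   # hypothetical COD call
    if detection is not None:
        kf.update(detection)
    return kf.x[:2]
```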

    On the Hardware/Software Design and Implementation of a High Definition Multiview Video Surveillance System


    Real-Time, Multiple Pan/Tilt/Zoom Computer Vision Tracking and 3D Positioning System for Unmanned Aerial System Metrology

    The study of the structural characteristics of Unmanned Aerial Systems (UASs) continues to be an important field of research for developing state-of-the-art nano/micro systems. A metrology system using computer vision (CV) tracking and 3D point extraction would provide an avenue for these theoretical developments. This work provides a portable, scalable system capable of real-time tracking, zooming, and 3D position estimation of a UAS using multiple cameras. Current state-of-the-art photogrammetry systems use retro-reflective markers or single-point lasers to obtain object poses and/or positions over time; a CV pan/tilt/zoom (PTZ) system has the potential to circumvent their limitations. The system developed in this paper exploits parallel processing and the GPU for CV tracking, using optical flow and known camera motion, in order to capture a moving object with two PTU cameras. The parallel-processing technique developed in this work is versatile, allowing other CV methods to be tested with a PTZ system using known camera motion. Utilizing known camera poses, the object's 3D position is estimated, and focal lengths are estimated to fill the image to a desired amount. The system is tested against truth data obtained using an industrial system.
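    As an illustration of the 3D position estimation step, the following sketch triangulates a point from two viewing rays with known camera poses, taking the midpoint of the shortest segment between the rays. This is a standard construction given under stated assumptions, not necessarily the exact estimator used in the paper.

```python
import numpy as np

def triangulate_midpoint(c1, d1, c2, d2):
    """Return the midpoint of the shortest segment between two viewing rays.
    c1, c2: camera centers in the world frame; d1, d2: ray directions derived
    from pixel coordinates, intrinsics, and the known pan/tilt pose."""
    d1 = d1 / np.linalg.norm(d1)
    d2 = d2 / np.linalg.norm(d2)
    b = c2 - c1
    dot = d1 @ d2
    denom = 1.0 - dot ** 2
    if denom < 1e-9:                  # rays (nearly) parallel: no stable solution
        return None
    t1 = (b @ d1 - (b @ d2) * dot) / denom
    t2 = ((b @ d1) * dot - b @ d2) / denom
    p1 = c1 + t1 * d1                 # closest point on ray 1
    p2 = c2 + t2 * d2                 # closest point on ray 2
    return 0.5 * (p1 + p2)            # 3D position estimate
```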

    Reconfigurable Vision Processing for Player Tracking in Indoor Sports

    Ibraheem OW. Reconfigurable Vision Processing for Player Tracking in Indoor Sports. Bielefeld: Universität Bielefeld; 2018.
    Over the past decade, there has been increasing growth in the use of vision-based systems for tracking players in sports. The tracking results are used to evaluate and enhance the performance of the players, as well as to provide detailed information (e.g., on player and team performance) to viewers. Player tracking using vision systems is a very challenging task due to the nature of sports games, which include severe and frequent interactions (e.g., occlusions) between the players. Additionally, these vision systems have high computational demands, since they require processing a huge amount of video data from multiple cameras with high resolution and high frame rate. As a result, most existing systems based on general-purpose computers cannot perform online real-time player tracking, but instead track the players offline using pre-recorded video files, limiting, e.g., direct feedback on player performance during the game. In this thesis, a reconfigurable vision-based system for automatically tracking players in indoor sports is presented. The proposed system targets player tracking for basketball and handball games. It processes the incoming video streams from GigE Vision cameras, achieving online real-time player tracking. The teams are identified and the players are detected based on the colors of their jerseys, using background subtraction, color thresholding, and graph clustering techniques. Moreover, the tracking-by-detection approach is used to realize player tracking. FPGA technology is used to handle the compute-intensive vision processing tasks by implementing video acquisition, video preprocessing, player segmentation, and team identification & player detection in hardware, while the less compute-intensive player tracking is performed on the CPU of a host PC. Player detection and tracking are evaluated using basketball and handball datasets. The results of this work show that the maximum achieved frame rate for the FPGA implementation is 96.7 fps using a Xilinx Virtex-4 FPGA and 136.4 fps using a Virtex-7 device. Player tracking requires an average processing time of 2.53 ms per frame on a host PC equipped with a 2.93 GHz Intel i7-870 CPU. As a result, the proposed reconfigurable system supports a maximum frame rate of 77.6 fps using two GigE Vision cameras with a resolution of 1392x1040 pixels each. Using the FPGA implementation, a speedup by a factor of 15.5 is achieved compared to an OpenCV-based software implementation on a host PC. Additionally, the results show high accuracy for player tracking. In particular, the achieved average precision and recall for player detection are up to 84.02% and 96.6%, respectively. For player tracking, the achieved average precision and recall are up to 94.85% and 94.72%, respectively. Furthermore, the proposed reconfigurable system achieves a 2.4 times higher performance per watt than a software-based implementation (without FPGA support) for player tracking on a host PC.
    Acknowledgments: I (Omar W. Ibraheem) would like to thank the German Academic Exchange Service (DAAD), the Cognitronics and Sensor Systems research group, and the Cluster of Excellence Cognitive Interaction Technology 'CITEC' (EXC 277) (Bielefeld University) not only for funding the work in this thesis, but also for all the help and support they gave me to successfully finish my thesis.
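    As a rough software analogue of the segmentation and team-identification stages (which the thesis implements in FPGA hardware, and with graph clustering in place of the simple connected-component step used here), the sketch below combines background subtraction with HSV jersey-color thresholding; the color ranges and minimum blob area are illustrative assumptions.

```python
import cv2
import numpy as np

bg_subtractor = cv2.createBackgroundSubtractorMOG2(history=500, detectShadows=False)

TEAM_HSV_RANGES = {                        # assumed jersey colors, per team
    "team_red":  ((0, 120, 80), (10, 255, 255)),
    "team_blue": ((100, 120, 80), (130, 255, 255)),
}
MIN_AREA = 400                             # reject blobs smaller than a player

def detect_players(frame_bgr):
    """Return {team: [(x, y, w, h), ...]} bounding boxes for one frame."""
    fg = bg_subtractor.apply(frame_bgr)    # background subtraction
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    players = {}
    for team, (lo, hi) in TEAM_HSV_RANGES.items():
        mask = cv2.inRange(hsv, np.array(lo), np.array(hi))
        mask = cv2.bitwise_and(mask, fg)   # keep only moving jersey pixels
        n, _, stats, _ = cv2.connectedComponentsWithStats(mask)
        players[team] = [tuple(stats[i, :4]) for i in range(1, n)
                         if stats[i, cv2.CC_STAT_AREA] >= MIN_AREA]
    return players
```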

    Face modeling for face recognition in the wild.

    Face understanding is considered one of the most important topics in the computer vision field, since the face is a rich source of information in social interaction. Not only does the face provide information about the identity of people, but also about their membership in broad demographic categories (including sex, race, and age) and about their current emotional state. Facial landmark extraction is the cornerstone of the success of different facial analysis and understanding applications. In this dissertation, a novel facial model is designed for facial landmark detection in unconstrained real-life environments from different image modalities, including infra-red and visible images. In the proposed facial landmark detector, a part-based model is incorporated with holistic face information. In the part-based model, the face is modeled by the appearance of different face parts (e.g., right eye, left eye, left eyebrow, nose, mouth) and their geometric relations. The appearance is described by a novel feature referred to as the pixel difference feature. This representation is three times faster than the state of the art in feature representation. To model the geometric relations between the face parts, the complex Bingham distribution is adapted from the statistics community into computer vision for modeling the geometric relationships between the facial elements. The global information is incorporated with the local part model using a regression model. The model outperforms the state of the art in detecting facial landmarks. The proposed facial landmark detector is tested in two computer vision problems: boosting the performance of face detectors by rejecting pseudo-faces, and camera steering in a multi-camera network. To highlight the applicability of the proposed model to different image modalities, it has been studied in two face understanding applications: face recognition from visible images and physiological measurement of autistic individuals from thermal images. Recognizing identities from faces under different poses, expressions, and lighting conditions against a complex background is a still unsolved problem, even with accurate detection of landmarks. Therefore, a learned similarity measure is proposed. The proposed measure responds only to differences in identity and filters out illumination and pose variations; it makes use of statistical inference in the image plane. Additionally, the pose challenge is tackled by two new approaches: assigning different weights to different face parts based on their visibility in the image plane at different pose angles, and synthesizing virtual facial images for each subject at different poses from a single frontal image. The proposed framework is demonstrated to be competitive with top-performing state-of-the-art methods, as evaluated on standard benchmarks for face recognition in the wild. The other face understanding application is physiological measurement of autistic individuals from infra-red images. In this framework, accurately detecting and tracking the Superficial Temporal Artery (STA) while the subject is moving, playing, and interacting in social communication is a must. It is very challenging to detect and track the STA, since the appearance of the STA region changes over time and is not discriminative enough from other areas of the face region. A novel concept in detection, called supporter collaboration, is introduced. In supporter collaboration, the STA is detected and tracked with the help of face landmarks and geometric constraints. This research advances the field of emotion recognition.
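    The abstract does not specify the pixel difference feature in detail; one plausible reading, sketched below in the spirit of BRIEF-style descriptors, describes a landmark patch by signed intensity differences between sampled pixel pairs. The patch size and sampling pattern are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
PATCH = 32                                         # patch side length (assumed)
N_PAIRS = 256                                      # descriptor length (assumed)
PAIRS = rng.integers(0, PATCH, size=(N_PAIRS, 4))  # (y1, x1, y2, x2) per pair

def pixel_difference_feature(gray, cx, cy):
    """Describe the patch centered at (cx, cy) by signed pixel differences.
    The caller must ensure the patch lies fully inside the image. The cost is
    just N_PAIRS subtractions, which is why such features are cheap."""
    half = PATCH // 2
    patch = gray[cy - half:cy + half, cx - half:cx + half].astype(np.float32)
    y1, x1, y2, x2 = PAIRS.T
    return patch[y1, x1] - patch[y2, x2]
```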

    Improving Indoor Security Surveillance by Fusing Data from BIM, UWB and Video

    Indoor physical security, as a perpetual and multi-layered concern, is a time-intensive and labor-consuming task. Various technologies have been leveraged to develop automatic access control, intrusion detection, and video monitoring systems. Video surveillance has been significantly enhanced by the advent of Pan-Tilt-Zoom (PTZ) cameras and advanced video processing, which together enable effective monitoring and recording. The development of ubiquitous object identification and tracking technologies provides the opportunity to accomplish automatic access control and tracking. Intrusion detection has also become possible through deploying networks of motion sensors that alert on abnormal behavior. However, each of the above-mentioned technologies has its own limitations. This thesis presents a fully automated indoor security solution that leverages an Ultra-wideband (UWB) Real-Time Locating System (RTLS), PTZ surveillance cameras, and a Building Information Model (BIM) as three sources of environmental data. Authorized persons are provided with UWB tags; unauthorized intruders are then identified by the mismatch between the detected tag owners and the persons detected in the video, and an intrusion alert is generated. PTZ cameras allow for wide-area monitoring and motion-based recording. Furthermore, the BIM is used for space modeling and for mapping the locations of intruders in the building. Fusing UWB tracking, video, and spatial data can automate the entire security procedure, from access control to intrusion alerting and behavior monitoring. Other benefits of the proposed method include more complex query processing and interoperability with other BIM-based solutions. A prototype system is implemented that demonstrates the feasibility of the proposed method.
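    The mismatch rule at the core of the intrusion detection can be illustrated with a short sketch: each authorized UWB tag may account for one person detected in video (after detections are mapped to floor-plane coordinates, e.g., via the BIM geometry), and any unaccounted-for person raises an alert. The data structures and matching radius below are hypothetical, not the thesis's implementation.

```python
def check_intrusion(video_detections, tag_positions, match_radius=1.0):
    """video_detections, tag_positions: lists of (x, y) floor-plane points.
    Returns the video detections that no authorized tag accounts for."""
    unmatched = []
    remaining = list(tag_positions)
    for person in video_detections:
        match = next((t for t in remaining
                      if (person[0] - t[0]) ** 2 + (person[1] - t[1]) ** 2
                      <= match_radius ** 2), None)
        if match is not None:
            remaining.remove(match)   # each tag explains exactly one person
        else:
            unmatched.append(person)  # person with no tag: possible intruder
    return unmatched

# A non-empty result would trigger an intrusion alert, with the positions
# mapped onto the BIM floor plan for monitoring.
```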

    Real-time video scene analysis with heterogeneous processors

    Field-Programmable Gate Arrays (FPGAs) and General-Purpose Graphics Processing Units (GPUs) allow acceleration and real-time processing of computationally intensive computer vision algorithms. The decision to use either architecture in a given application is determined by task-specific priorities such as processing latency, power consumption, and algorithm accuracy. This choice is normally made at design time on a heuristic or fixed algorithmic basis; here we propose an alternative method for automatic runtime selection. In this thesis, we describe our PC-based system architecture containing both platforms; this provides greater flexibility and allows dynamic selection of processing platforms to suit changing scene priorities. Using the Histograms of Oriented Gradients (HOG) algorithm for pedestrian detection, we comprehensively explore algorithm implementation on the FPGA, the GPU, and a combination of both, and show that the effect of data-transfer time on overall processing performance is significant. We also characterise the performance of each implementation and quantify the trade-offs between power, time, and accuracy when moving processing between architectures, then specify the optimal architecture to use when prioritising each of these. We apply this knowledge to a real-time surveillance application representative of anomaly detection problems: detecting parked vehicles in videos. Using motion detection and car and pedestrian HOG detectors implemented across multiple architectures to generate detections, we use trajectory clustering and a Bayesian contextual motion algorithm to compute an overall scene anomaly level. This level is in turn used to select the architectures on which to run the compute-intensive detectors for the next frame, with higher anomaly levels selecting faster, higher-power implementations. Comparing dynamic, context-driven prioritisation of system performance against a fixed mapping of algorithms to architectures shows that our dynamic mapping method is 10% more accurate at detecting events than the power-optimised version, at the cost of 12 W higher power consumption.
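    A minimal sketch of the runtime selection policy follows: the scene anomaly level chooses which implementation runs the compute-intensive detectors for the next frame. The thresholds and the latency/power figures are illustrative assumptions, not the thesis's measured values.

```python
from dataclasses import dataclass

@dataclass
class Implementation:
    name: str
    latency_ms: float   # per-frame processing time (illustrative)
    power_w: float      # power draw while active (illustrative)

# Ordered from slowest/lowest-power to fastest/highest-power.
PROFILES = [
    Implementation("fpga_only", latency_ms=80.0, power_w=8.0),
    Implementation("fpga_gpu",  latency_ms=45.0, power_w=60.0),
    Implementation("gpu_only",  latency_ms=30.0, power_w=110.0),
]

def select_architecture(anomaly_level):
    """Higher scene anomaly selects a faster, higher-power implementation."""
    if anomaly_level < 0.3:
        return PROFILES[0]
    if anomaly_level < 0.7:
        return PROFILES[1]
    return PROFILES[2]
```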

    People detection and tracking using a network of low-cost depth cameras

    Automatic people detection is a widely adopted technology with applications in retail stores, crowd management, and surveillance. The goal of this work is to create a general-purpose framework for people detection indoors. First, studies on people detection, tracking, and re-identification are reviewed, with an emphasis on people detection from depth images. An approach based on a network of low-cost smart depth cameras is then presented. Detection performance is evaluated on four image sequences totalling over 20,000 depth images. The results are promising and show that simple, computationally lightweight algorithms are well suited to practical applications.
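    In the spirit of the simple, lightweight methods the work favors, the sketch below detects people in an overhead depth image by subtracting a background (floor) depth model and keeping sufficiently large connected blobs; the thresholds are illustrative assumptions, not the thesis's parameters.

```python
import numpy as np
from scipy import ndimage

def detect_people(depth, background, min_height_mm=1200, min_area_px=300):
    """depth, background: 2D arrays of distances (mm) from a ceiling-mounted
    depth camera. A person appears as a blob significantly closer to the
    camera than the floor. Returns (x, y) image centroids of detections."""
    foreground = (background - depth) > min_height_mm  # raised above the floor
    labels, n = ndimage.label(foreground)              # connected components
    people = []
    for i in range(1, n + 1):
        blob = labels == i
        if blob.sum() >= min_area_px:                  # drop noise blobs
            cy, cx = ndimage.center_of_mass(blob)      # head/torso centroid
            people.append((cx, cy))
    return people
```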