
    Real-time 3D face localization and verification

    We present a method for real-time 3D face localization and verification using a consumer-grade depth camera. Our approach consists of three parts: face detection, head pose estimation, and face verification. Face detection is performed using a standard detection framework which we significantly improve by leveraging depth information. To estimate the pose of the detected face, we developed a technique that uses a combination of particle swarm optimization (PSO) and the iterative closest point (ICP) algorithm to accurately align a 3D face model to the measured depth data. With the face localized within the image, we can compare a database 3D face model to the depth image to verify the identity of the subject. We learn a similarity metric using a random decision forest to accurately authenticate the subject. We demonstrate state-of-the-art results for both face localization and face verification on standard datasets. Since the camera and our method operate at video rate, our system is capable of continuously authenticating a subject while he/she uses his/her device.
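
    The verification step above learns a similarity metric with a random decision forest. The sketch below illustrates one common way to do that: represent a pair of face descriptors by their element-wise absolute difference and train a forest to predict same/different identity, using its class probability as the similarity score. The descriptor shapes, pair construction and names are illustrative assumptions, not the paper's implementation.

    # Sketch: learning a same/different similarity metric with a random forest,
    # illustrating the verification step described above. The descriptor
    # extraction and pair construction are assumptions, not the paper's code.
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    def pair_features(desc_a, desc_b):
        """Represent a pair of face descriptors by their element-wise absolute difference."""
        return np.abs(desc_a - desc_b)

    rng = np.random.default_rng(0)
    descriptors = rng.normal(size=(200, 64))   # placeholder depth-face descriptors
    labels = rng.integers(0, 20, size=200)     # placeholder identities

    pairs, y_pairs = [], []
    for i in range(len(descriptors)):
        for j in range(i + 1, min(i + 6, len(descriptors))):  # a few pairs per sample
            pairs.append(pair_features(descriptors[i], descriptors[j]))
            y_pairs.append(int(labels[i] == labels[j]))        # 1 = same person, 0 = different
    X_pairs = np.stack(pairs)

    forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_pairs, y_pairs)

    def similarity(desc_a, desc_b):
        """Probability that two descriptors belong to the same person."""
        return forest.predict_proba(pair_features(desc_a, desc_b)[None, :])[0, 1]

    In a continuous-authentication setting, similarity() would be evaluated per frame against the enrolled model and thresholded.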

    Person Detection, Tracking and Identification by Mobile Robots Using RGB-D Images

    This dissertation addresses the use of RGB-D images for six important tasks of mobile robots: face detection, face tracking, face pose estimation, face recognition, person detection and person tracking. These topics have been widely researched in recent years because they provide mobile robots with abilities necessary to communicate with humans in natural ways. RGB-D images from a Microsoft Kinect camera are expected to play an important role in improving both the accuracy and the computational cost of the proposed algorithms for mobile robots. We contribute several applications of the Microsoft Kinect camera for mobile robots and show their effectiveness through realistic experiments on our mobile robots. An important component for mobile robots to interact with humans in a natural way is real-time multiple face detection. Various face detection algorithms for mobile robots have been proposed; however, almost none of them meet the accuracy and speed requirements needed to run in real time on a robot platform. Within the scope of our research, we have developed a method that combines the color and depth images provided by a Kinect camera with navigation information for face detection on mobile robots. We demonstrate several experiments with challenging datasets. Our results show that this method improves accuracy, reduces computational cost, and runs in real time in indoor environments. Tracking faces in uncontrolled environments remains a challenging task because both the face and the background change quickly over time and the face often moves through different illumination conditions. RGB-D images are beneficial for this task because the mobile robot can easily estimate the face size and thus improve the performance of face tracking at different distances between the robot and the human. In this dissertation, we present a real-time algorithm for mobile robots to track human faces accurately even though humans can move freely, far away from the camera, or through different illumination conditions in uncontrolled environments. We combine an adaptive correlation filter (Bolme et al., 2010) with Viola-Jones object detection (Viola and Jones, 2001b) to track the face. Furthermore, we introduce a new technique for face pose estimation, which is applied after tracking the face. On the tracked face, the adaptive correlation filter combined with Viola-Jones object detection is also applied to reliably track facial features, including the two external eye corners and the nose. These facial features provide geometric cues to estimate the face pose robustly. We carefully analyze the accuracy of these approaches on different datasets and show how they can run robustly on a mobile robot in uncontrolled environments. Both face tracking and face pose estimation play key roles as essential preprocessing steps for robust face recognition on mobile robots. The ability to recognize faces is a crucial element for human-robot interaction. Therefore, we pursue an approach for mobile robots to detect, track and recognize human faces accurately, even as they pass through different illumination conditions. For improved accuracy, the tracked face is recognized using an algorithm that combines local ternary patterns and collaborative representation based classification.
This approach inherits the advantages of both collaborative representation based classification, which is fast and relatively accurate, and local ternary patterns, which are robust to face misalignment and complex illumination conditions. This combination enhances the efficiency of face recognition under varying illumination and noisy conditions. Our method achieves high recognition rates on challenging face databases and can run in real time on mobile robots. An important application field of RGB-D images is person detection and tracking by mobile robots. Compared to classical RGB images, RGB-D images provide additional depth information to locate humans more precisely and reliably. For this purpose, the mobile robot moves around its environment and continuously detects and tracks people reliably, even when they adopt a wide variety of poses and are frequently occluded. We have improved the performance of face and upper-body detection to make person detection more robust to partial occlusions and changes in human pose. To handle the greater challenges of complex pose changes and occlusions, we concurrently use a fast compressive tracker and a Kalman filter to track the detected humans. Experimental results on a challenging database show that our method achieves high performance and can run in real time on mobile robots.
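
    As one concrete illustration of the depth-assisted face detection described above, the sketch below bounds the Viola-Jones search scale using the face size expected at the measured depth, so the detector only evaluates plausible window sizes. The focal length, face-width constant and helper name are assumptions for illustration, not the dissertation's code.

    # Sketch: using depth to constrain Viola-Jones face detection, in the spirit
    # of the depth-assisted detection described above. Constants and helper
    # names are illustrative assumptions.
    import cv2
    import numpy as np

    FACE_WIDTH_M = 0.16    # assumed average face width in meters
    FOCAL_PX = 525.0       # assumed Kinect color-camera focal length in pixels

    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

    def detect_faces_depth_aware(bgr, depth_m):
        """Detect faces, bounding the search scale by the measured depth range."""
        gray = cv2.cvtColor(bgr, cv2.COLOR_BGR2GRAY)
        valid = depth_m[depth_m > 0]
        if valid.size == 0:
            return cascade.detectMultiScale(gray)
        # Expected face size in pixels at the nearest and farthest measured depths.
        near, far = float(valid.min()), float(valid.max())
        max_side = int(FOCAL_PX * FACE_WIDTH_M / near)
        min_side = max(24, int(FOCAL_PX * FACE_WIDTH_M / far))
        return cascade.detectMultiScale(
            gray, scaleFactor=1.1, minNeighbors=4,
            minSize=(min_side, min_side), maxSize=(max_side, max_side))

    Restricting the window sizes this way is what makes depth useful for speed as well as accuracy: far fewer scales need to be scanned per frame.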

    A Novel Low Processing Time System for Criminal Activities Detection Applied to Command and Control Citizen Security Centers

    This paper presents a novel low-processing-time system for criminal activity detection based on real-time video analysis, applied to Command and Control Citizen Security Centers. The system was applied to the detection and classification of criminal events in a real-time video surveillance subsystem of the Command and Control Citizen Security Center of the Colombian National Police. It was developed using a novel application of deep learning, specifically a Faster Region-Based Convolutional Network (Faster R-CNN), to detect criminal activities treated as "objects" in real-time video. To maximize system efficiency and reduce the processing time of each video frame, the pretrained CNN (Convolutional Neural Network) model AlexNet was used, and fine-tuning was carried out with a dataset built for this project, consisting of objects commonly used in criminal activities such as short firearms and bladed weapons. In addition, the system was trained for street-theft detection. The system can generate alarms when detecting street theft, short firearms and bladed weapons, improving situational awareness and facilitating strategic decision making in the Command and Control Citizen Security Center of the Colombian National Police. This work was co-funded by the European Commission as part of the H2020 call SEC-12-FCT-2016-Subtopic3 under the project VICTORIA (No. 740754). This publication reflects the views only of the authors, and the Commission cannot be held responsible for any use which may be made of the information contained therein. Suarez-Paez, J.; Salcedo-Gonzalez, M.; Climente, A.; Esteve Domingo, M.; Gomez, J.; Palau Salvador, C. E.; PĂ©rez Llopis, I. (2019). A Novel Low Processing Time System for Criminal Activities Detection Applied to Command and Control Citizen Security Centers. Information, 10(12), 1-19. https://doi.org/10.3390/info10120365
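
    As an illustration of the fine-tuning workflow the abstract describes, the sketch below adapts a torchvision Faster R-CNN detector to a small set of custom object classes (e.g., short firearm, bladed weapon, street theft). The ResNet-50 FPN backbone here is a stand-in for the paper's AlexNet-based configuration, and the class list and data loading are assumptions.

    # Sketch: fine-tuning a Faster R-CNN detector for custom object classes.
    # Uses torchvision's reference API; the backbone differs from the paper's
    # AlexNet-based setup and the dataset wiring is assumed.
    import torch
    import torchvision
    from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

    CLASSES = ["__background__", "short_firearm", "bladed_weapon", "street_theft"]

    def build_model(num_classes=len(CLASSES)):
        # Start from a detector pretrained on COCO and replace its box predictor.
        model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
        in_features = model.roi_heads.box_predictor.cls_score.in_features
        model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)
        return model

    def train_one_epoch(model, loader, optimizer, device):
        model.train()
        for images, targets in loader:  # targets: list of dicts with "boxes" and "labels"
            images = [img.to(device) for img in images]
            targets = [{k: v.to(device) for k, v in t.items()} for t in targets]
            loss_dict = model(images, targets)  # returns a dict of losses in train mode
            loss = sum(loss_dict.values())
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

    At inference time the same model, switched to eval mode, returns boxes, labels and scores per frame, which can be thresholded to raise alarms.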

    A new fuzzy based algorithm for solving stereo vagueness in detecting and tracking people

    This paper describes a system capable of detecting and tracking multiple people using a new approach based on colour, stereo vision and fuzzy logic. Initially, in the people detection phase, two fuzzy systems are used to filter out false positives of a face detector. Then, in the tracking phase, a new fuzzy logic based particle filter (FLPF) is proposed to fuse stereo and colour information, assigning different confidence levels to each of these information sources. Information regarding depth and occlusion is used to create these confidence levels. In this way, the system is able to keep track of people in the reference camera image even when either the stereo or the colour information is confusing or unreliable. To carry out the tracking, the new FLPF generates several particles while several fuzzy systems compute the possibility that each generated particle corresponds to the new position of a person. Our technique outperforms two well-known tracking approaches, one based on the method of Nummiaro et al. [1] and the other based on the Kalman/mean-shift tracker of Comaniciu and Ramesh [2]. All these approaches were tested on several colour-with-distance sequences simulating real-life scenarios. The results show that our system is able to keep track of people in most of the situations where the other trackers fail, as well as to determine the size of their projections in the camera image. In addition, the method is fast enough for real-time applications. Funding: FCT Scholarship SFRH/BD/22359/2005; POPH/FSE (Programa Operacional Potencial Humano do Fundo Social Europeu); Spanish MCI Project TIN2007-66367; Andalusian Regional Government project P09-TIC-0481.
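
    The FLPF itself is not spelled out in the abstract; the sketch below shows a generic particle filter that blends a colour cue and a depth cue according to per-frame confidence values, which conveys the flavour of the fusion without reproducing the authors' fuzzy systems. All names and constants are illustrative assumptions.

    # Sketch: a particle filter that fuses a colour cue and a depth (stereo) cue,
    # weighting each cue by a per-frame confidence. A generic illustration, not
    # the authors' FLPF.
    import numpy as np

    rng = np.random.default_rng(0)

    def step(particles, weights, colour_lik, depth_lik, colour_conf, depth_conf,
             motion_std=5.0):
        """One predict/update/resample cycle over 2D image-position particles."""
        # Predict: random-walk motion model.
        particles = particles + rng.normal(scale=motion_std, size=particles.shape)

        # Update: blend the two cue likelihoods according to their confidences
        # (a crisp stand-in for the fuzzy confidence assignment in the paper).
        total = colour_conf + depth_conf + 1e-9
        lik = (colour_conf * colour_lik(particles) +
               depth_conf * depth_lik(particles)) / total
        weights = weights * lik
        weights = weights / (weights.sum() + 1e-12)

        # Resample when the effective sample size drops too low.
        if 1.0 / np.sum(weights ** 2) < 0.5 * len(weights):
            idx = rng.choice(len(weights), size=len(weights), p=weights)
            particles = particles[idx]
            weights = np.full(len(weights), 1.0 / len(weights))
        estimate = np.average(particles, axis=0, weights=weights)
        return particles, weights, estimate

    Here colour_lik and depth_lik would score each particle against, for example, a colour histogram of the target and the person size expected from the stereo depth, with the confidences lowered when the corresponding cue is occluded or unreliable.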

    Real-Time High-Resolution Multiple-Camera Depth Map Estimation Hardware and Its Applications

    Depth information is used in a variety of 3D-based signal processing applications such as autonomous navigation of robots and driving systems, object detection and tracking, computer games, 3D television, and free-viewpoint synthesis. These applications require high accuracy and speed from depth estimation. Depth maps can be generated using disparity estimation methods, which rely on stereo matching between multiple images. The computational complexity of disparity estimation algorithms and the large size and bandwidth demands on external and internal memory make real-time disparity estimation challenging, especially for high-resolution images. This thesis proposes high-resolution, high-quality multiple-camera depth map estimation hardware. The proposed hardware is verified in real time within a complete system, from initial image capture to display and applications, and the details of that system are presented. The proposed binocular and trinocular adaptive-window-size disparity estimation algorithms are carefully designed to suit real-time hardware implementation, allowing efficient parallel and local processing while providing high-quality results. The proposed binocular and trinocular disparity estimation hardware implementations can process 55 frames per second on a Virtex-7 FPGA at 1024 x 768 XGA video resolution for a 128-pixel disparity range. The proposed binocular disparity estimation hardware provides the best quality among existing real-time high-resolution disparity estimation hardware implementations. A novel compressed look-up table based rectification algorithm and its real-time hardware implementation are presented. The low-complexity decompression process of the rectification hardware uses a negligible amount of LUT and DFF resources on the FPGA and requires no external memory. The first real-time high-resolution free-viewpoint synthesis hardware utilizing three-camera disparity estimation is presented. The proposed hardware generates high-quality free-viewpoint video in real time for an arbitrary, horizontally aligned camera position between the leftmost and rightmost physical cameras. The complete embedded system for depth estimation is described: it transfers disparity results together with synchronized RGB pixels to a PC for application development. Several real-time applications are developed on the PC using the obtained RGB+D results: depth-based image thresholding, speed and distance measurement, head-hands-shoulders tracking, a virtual mouse using hand tracking, and face tracking integrated with free-viewpoint synthesis. The proposed binocular disparity estimation hardware is also implemented in an ASIC. The ASIC implementation imposes additional constraints with respect to the FPGA implementation; these restrictions, their efficient solutions and the ASIC implementation results are presented. In addition, a very high-resolution (82.3 MP) 360° x 90° omnidirectional multiple-camera system is proposed. The hemispherical camera system is able to view target locations close to the horizontal plane with more than two cameras, so it can be used for high-resolution 360° depth map estimation and its applications in the future.
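
    As a rough software analogue of the local stereo matching the thesis implements in hardware, the sketch below computes a disparity map over a 128-pixel range with OpenCV block matching and converts it to metric depth. The thesis uses an adaptive window and custom hardware; the fixed-window matcher, file names, focal length and baseline here are placeholder assumptions.

    # Sketch: software block-matching disparity estimation over a 128-pixel
    # disparity range on a rectified stereo pair, followed by conversion to
    # metric depth. Placeholder file names and calibration values.
    import cv2
    import numpy as np

    left = cv2.imread("left_1024x768.png", cv2.IMREAD_GRAYSCALE)
    right = cv2.imread("right_1024x768.png", cv2.IMREAD_GRAYSCALE)

    # numDisparities must be a multiple of 16; blockSize is the fixed matching window.
    matcher = cv2.StereoBM_create(numDisparities=128, blockSize=15)
    disparity = matcher.compute(left, right).astype("float32") / 16.0  # fixed-point -> pixels

    # Convert disparity to depth with depth = f * baseline / disparity
    # (focal length and baseline are placeholders for a calibrated rig).
    FOCAL_PX, BASELINE_M = 1000.0, 0.10
    depth_m = (FOCAL_PX * BASELINE_M) / np.maximum(disparity, 0.001)

    The per-pixel, fixed-neighbourhood nature of this matching is what makes the algorithm amenable to the parallel, local processing exploited by the FPGA and ASIC implementations described above.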
