14 research outputs found

    BiLSTM with CNN Features For HAR in Videos

    Action recognition in videos is currently a topic of interest in computer vision because of its potential applications, such as multimedia indexing and surveillance of public spaces. In this work we propose a CNN-BiLSTM architecture. First, a pretrained VGG16 convolutional neural network extracts features from the input video. Then, a BiLSTM classifies the video into a particular class. We evaluate the performance of our system using accuracy as the evaluation metric, obtaining 40.9% and 78.1% on the HMDB-51 and UCF-101 datasets, respectively.
    Sociedad Argentina de Informática e Investigación Operativa

    Faster and Accurate Compressed Video Action Recognition Straight from the Frequency Domain

    Human action recognition has become one of the most active fields of research in computer vision due to its wide range of applications, such as surveillance, medical and industrial environments, and smart homes. Recently, deep learning has been successfully used to learn powerful and interpretable features for recognizing human actions in videos. Most existing deep learning approaches have been designed to process video information as RGB image sequences. For this reason, a preliminary decoding process is required, since video data are often stored in a compressed format. However, decoding a video demands a high computational load and memory usage. To overcome this problem, we propose a deep neural network capable of learning straight from compressed video. Our approach was evaluated on two public benchmarks, the UCF-101 and HMDB-51 datasets, demonstrating recognition performance comparable to state-of-the-art methods, with the advantage of running up to 2 times faster in terms of inference speed.

    A Novel Low Processing Time System for Criminal Activities Detection Applied to Command and Control Citizen Security Centers

    This paper presents a novel low-processing-time system for criminal activity detection based on real-time video analysis, applied to Command and Control Citizen Security Centers. The system was applied to the detection and classification of criminal events in a real-time video surveillance subsystem in the Command and Control Citizen Security Center of the Colombian National Police. It was developed using a novel application of deep learning, specifically a Faster Region-Based Convolutional Neural Network (Faster R-CNN), for the detection of criminal activities treated as "objects" to be detected in real-time video. To maximize system efficiency and reduce the processing time of each video frame, the pretrained CNN (Convolutional Neural Network) model AlexNet was used, and fine-tuning was carried out with a dataset built for this project, formed by objects commonly used in criminal activities, such as short firearms and bladed weapons. In addition, the system was trained for street theft detection. The system can generate alarms when detecting street theft, short firearms and bladed weapons, improving situational awareness and facilitating strategic decision making in the Command and Control Citizen Security Center of the Colombian National Police.
    This work was co-funded by the European Commission as part of H2020 call SEC-12-FCT-2016-Subtopic3 under the project VICTORIA (No. 740754). This publication reflects the views only of the authors and the Commission cannot be held responsible for any use which may be made of the information contained therein.
    Suarez-Paez, J.; Salcedo-Gonzalez, M.; Climente, A.; Esteve Domingo, M.; Gomez, J.; Palau Salvador, CE.; Pérez Llopis, I. (2019). A Novel Low Processing Time System for Criminal Activities Detection Applied to Command and Control Citizen Security Centers. Information. 10(12):1-19. https://doi.org/10.3390/info10120365

    Biased Mixtures Of Experts: Enabling Computer Vision Inference Under Data Transfer Limitations

    We propose a novel mixture-of-experts class to optimize computer vision models in accordance with data transfer limitations at test time. Our approach postulates that the minimum acceptable amount of data allowing for highly accurate results can vary for different input space partitions. Therefore, we consider mixtures where experts require different amounts of data, and train a sparse gating function to divide the input space among the experts. By appropriate hyperparameter selection, our approach is able to bias mixtures of experts towards selecting specific experts over others. In this way, we show that the data transfer optimization between visual sensing and processing can be solved as a convex optimization problem. To demonstrate the relation between data availability and performance, we evaluate biased mixtures on a range of mainstream computer vision problems, namely: (i) single shot detection, (ii) image super resolution, and (iii) real-time video action classification. For all cases, and when experts constitute modified baselines to meet different limits on allowed data utility, biased mixtures significantly outperform previous work optimized to meet the same constraints on available data.