315 research outputs found

    Eye in the Sky: Real-time Drone Surveillance System (DSS) for Violent Individuals Identification using ScatterNet Hybrid Deep Learning Network

    Drone systems have been deployed by various law enforcement agencies to monitor hostiles, spy on foreign drug cartels, conduct border control operations, etc. This paper introduces a real-time drone surveillance system to identify violent individuals in public areas. The system first uses the Feature Pyramid Network to detect humans in aerial images. The image region containing each human is passed to the proposed ScatterNet Hybrid Deep Learning (SHDL) network for human pose estimation. The orientations between the limbs of the estimated pose are then used to identify violent individuals. The proposed deep network can learn meaningful representations quickly using ScatterNet and structural priors with relatively few labeled examples. The system detects violent individuals in real time by processing the drone images in the cloud. This research also introduces the aerial violent individual dataset used for training the deep network, which may encourage researchers interested in using deep learning for aerial surveillance. The pose estimation and violent individual identification performance is compared with state-of-the-art techniques. Comment: To appear in the Efficient Deep Learning for Computer Vision (ECV) workshop at IEEE Computer Vision and Pattern Recognition (CVPR) 2018. YouTube demo: https://www.youtube.com/watch?v=zYypJPJipY
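    The abstract does not say how limb orientations are computed or classified, so the following is only a minimal sketch of the geometric step: measuring the angle at a joint between two limbs of an estimated 2D pose. The keypoint names, the coordinates, and the `limb_angle` helper are hypothetical and are not taken from the paper.

```python
import math


def limb_angle(joint_a, joint_b, joint_c):
    """Return the angle in degrees at joint_b between limbs b->a and b->c."""
    v1 = (joint_a[0] - joint_b[0], joint_a[1] - joint_b[1])
    v2 = (joint_c[0] - joint_b[0], joint_c[1] - joint_b[1])
    norm1 = math.hypot(*v1)
    norm2 = math.hypot(*v2)
    if norm1 == 0 or norm2 == 0:
        raise ValueError("degenerate limb: two joints coincide")
    cos_theta = (v1[0] * v2[0] + v1[1] * v2[1]) / (norm1 * norm2)
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.degrees(math.acos(cos_theta))


# Hypothetical keypoints (x, y) in pixels, as a pose estimator might return them.
pose = {"shoulder": (0.0, 0.0), "elbow": (1.0, 0.0), "wrist": (1.0, 1.0)}
elbow_angle = limb_angle(pose["shoulder"], pose["elbow"], pose["wrist"])
straight_arm = limb_angle((0.0, 0.0), (1.0, 0.0), (2.0, 0.0))
```

    A vector of such angles, one per joint, could then be fed to any classifier; which classifier the authors use is not stated in the abstract.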

    Background subtraction and the YOLO algorithm: Two methods for detecting people in uncontrolled environments

    Introduction: This article is the result of research entitled “Signal processing system for the detection of people in agglomerations in areas of public space in the city of Cúcuta”, developed at the Universidad Francisco de Paula Santander in 2020. Problem: The high percentage of false positives and false negatives in people detection processes complicates decision making in video surveillance, tracking and tracing applications. Objective: To determine which people detection technique gives better results in terms of response time and detection hits. Methodology: Two techniques for detecting people in uncontrolled environments are validated in Python with videos taken inside the Universidad Francisco de Paula Santander: background subtraction and the YOLO algorithm. Results: The background subtraction technique achieved a hit rate of 84.07% and an average response time of 0.815 seconds. The YOLO algorithm achieved a hit rate of 90% and an average response time of 4.59 seconds. Conclusion: The results suggest using the background subtraction technique on hardware such as the Raspberry Pi 3B+ board for processes that prioritize real-time analysis of information, while the YOLO algorithm has the characteristics required for processes in which the information is analyzed after the image is acquired. Originality: This research analyzed the aspects required for real-time analysis of information obtained in people detection processes in uncontrolled environments. Limitations: The analyzed videos were taken only at the Universidad Francisco de Paula Santander. Also, the Raspberry Pi 3B+ board overheats when processing the video images, because the task uses the device's full resources.
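    The abstract does not specify which background subtraction variant was used. As a rough illustration of why the technique is cheap enough for a small board, here is a minimal sketch assuming a running-average background model on grayscale frames stored as nested lists. The function names, the `alpha` learning rate, and the `threshold` value are illustrative assumptions, not values from the article.

```python
def foreground_mask(background, frame, threshold=30):
    """Mark pixels whose intensity differs from the background by more than threshold."""
    return [
        [1 if abs(f - b) > threshold else 0 for b, f in zip(b_row, f_row)]
        for b_row, f_row in zip(background, frame)
    ]


def update_background(background, frame, alpha=0.05):
    """Blend the new frame into the background model (exponential running average)."""
    return [
        [(1 - alpha) * b + alpha * f for b, f in zip(b_row, f_row)]
        for b_row, f_row in zip(background, frame)
    ]


# Toy 3x4 grayscale scene: a flat background and a frame with a bright 1x2 object.
background = [[10.0] * 4 for _ in range(3)]
frame = [
    [10, 10, 10, 10],
    [10, 200, 200, 10],
    [10, 10, 10, 10],
]

mask = foreground_mask(background, frame)
foreground_pixels = sum(map(sum, mask))
background = update_background(background, frame)
```

    Each frame costs one subtraction and one comparison per pixel, which is consistent with the short response time reported for background subtraction, whereas a detector such as YOLO runs a full neural network per frame.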

    Survey on video anomaly detection in dynamic scenes with moving cameras

    The increasing popularity of compact and inexpensive cameras, e.g., dash cameras, body cameras, and cameras mounted on robots, has sparked a growing interest in detecting anomalies within dynamic scenes recorded by moving cameras. However, existing reviews primarily concentrate on Video Anomaly Detection (VAD) methods assuming static cameras. The VAD literature with moving cameras remains fragmented, lacking comprehensive reviews to date. To address this gap, we endeavor to present the first comprehensive survey on Moving Camera Video Anomaly Detection (MC-VAD). We delve into the research papers related to MC-VAD, critically assessing their limitations and highlighting associated challenges. Our exploration encompasses three application domains: security, urban transportation, and marine environments, which in turn cover six specific tasks. We compile an extensive list of 25 publicly available datasets spanning four distinct environments: underwater, water surface, ground, and aerial. We summarize the types of anomalies these datasets correspond to or contain, and present five main categories of approaches for detecting such anomalies. Lastly, we identify future research directions and discuss novel contributions that could advance the field of MC-VAD. With this survey, we aim to offer a valuable reference for researchers and practitioners striving to develop and advance state-of-the-art MC-VAD methods. Comment: Under review

    Deep learning for automatic violence detection: tests on the AIRTLab dataset

    Following the growing availability of video surveillance cameras and the need for techniques to automatically identify events in video footage, there is increasing interest in automatic violence detection in videos. Deep learning-based architectures, such as 3D Convolutional Neural Networks, have demonstrated their capability of extracting spatio-temporal features from videos, being effective in violence detection. However, friendly behaviours or fast moves such as hugs, small hits, claps, high fives, etc., can still cause false positives, where a harmless action is interpreted as violent. To this end, we present three deep learning-based models for violence detection and test them on the AIRTLab dataset, a novel dataset designed to check the robustness of algorithms against false positives. The objective is twofold: on one hand, we compute accuracy metrics on the three proposed models (two are based on transfer learning and one is trained from scratch), building a baseline of metrics for the AIRTLab dataset; on the other hand, we validate the capability of the proposed dataset to challenge robustness to false positives. The results of the proposed models are in line with the scientific literature in terms of accuracy, with transfer learning-based networks exhibiting better generalization capabilities than the network trained from scratch. Moreover, the tests highlighted that most of the classification errors concern the identification of non-violent clips, validating the design of the proposed dataset. Finally, to demonstrate the significance of the proposed models, the paper presents a comparison with the related literature, as well as with models based on well-established pre-trained 2D Convolutional Neural Networks (2D CNNs). This comparison highlights that 3D models achieve better accuracy than time-distributed 2D CNNs (merged with a recurrent model) in processing the spatio-temporal features of video clips. The source code of the experiments and the AIRTLab dataset are available in public repositories
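    To make the false-positive analysis concrete, the sketch below shows one way to check whether classification errors are concentrated on non-violent clips. The labels are made-up toy data (1 = violent, 0 = non-violent), and `confusion_counts` is a hypothetical helper, not code from the authors' repository.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = violent)."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


# Toy ground truth and predictions for eight clips (illustrative only).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 1, 1, 1, 0, 0, 0]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)
false_positive_rate = fp / (fp + tn)
# Share of all errors that are harmless clips flagged as violent.
error_share_false_positives = fp / (fp + fn)
```

    A high `error_share_false_positives` is the pattern the abstract describes: most mistakes come from non-violent clips being flagged as violent.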

    Counter-terrorism in cyber–physical spaces: Best practices and technologies from the state of the art

    Context: The demand for protection and security of physical spaces and urban areas has increased with the escalation of terrorist attacks in recent years. With the proposed cyber–physical systems and spaces, we envision a city that becomes a smarter urban object, proactively providing alerts and protecting against any threat. Objectives: This survey intends to provide a systematic multivocal literature survey comprising an updated, comprehensive and timely overview of the state of the art in counter-terrorism cyber–physical systems aimed at the protection of cyber–physical spaces. It also provides guidelines to law enforcement agencies and practitioners by describing technologies and best practices for the protection of public spaces. Methods: We analyzed 112 papers collected from different online sources, both from the academic field and from websites and blogs, ranging from 2004 to mid-2022. Results: (a) There is no single bullet-proof solution available for the protection of public spaces. (b) From our analysis we found three major active fields for the protection of public spaces: Information Technologies, Architectural approaches, and the Organizational field. (c) While academia suggests best practices and methodologies for the protection of urban areas, the market has not provided any implementation of the suggested approaches, which shows a lack of cross-fertilization between academia and industry. Conclusion: The overall analysis has led us to state that there is no single solution available; instead, multiple methods and techniques can be put in place to guarantee safety and security in public spaces. The techniques range from architectural approaches that rethink the design of public spaces with security continuously in mind, to emerging technologies such as AI and predictive surveillance.

    FuTH-Net: Fusing Temporal Relations and Holistic Features for Aerial Video Classification

    Unmanned aerial vehicles (UAVs) are now widely applied to data acquisition due to their low cost and fast mobility. With the increasing volume of aerial videos, the demand for automatically parsing these videos is surging. To achieve this, current research mainly focuses on extracting a holistic feature with convolutions along both spatial and temporal dimensions. However, these methods are limited by small temporal receptive fields and cannot adequately capture long-term temporal dependencies that are important for describing complicated dynamics. In this article, we propose a novel deep neural network, termed Fusing Temporal relations and Holistic features for aerial video classification (FuTH-Net), to model not only holistic features but also temporal relations for aerial video classification. Furthermore, the holistic features are refined by the multiscale temporal relations in a novel fusion module to yield more discriminative video representations. More specifically, FuTH-Net employs a two-pathway architecture: 1) a holistic representation pathway to learn a general feature of both frame appearances and short-term temporal variations and 2) a temporal relation pathway to capture multiscale temporal relations across arbitrary frames, providing long-term temporal dependencies. Afterward, a novel fusion module is proposed to spatiotemporally integrate the two features learned from the two pathways. Our model is evaluated on two aerial video classification datasets, ERA and Drone-Action, and achieves state-of-the-art results. This demonstrates its effectiveness and good generalization capacity across different recognition tasks (event classification and human action recognition). To facilitate further research, we release the code at https://gitlab.lrz.de/ai4eo/reasoning/futh-net
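    The abstract does not describe how frame groups are formed in the temporal relation pathway. As a loose illustration of what "multiscale temporal relations across arbitrary frames" can mean, the sketch below enumerates time-ordered frame tuples at several scales. The function `multiscale_frame_tuples` and the choice of scales are assumptions for illustration and are not taken from the released FuTH-Net code.

```python
from itertools import combinations


def multiscale_frame_tuples(num_frames, scales):
    """Map each scale k to all time-ordered k-tuples of frame indices."""
    return {k: list(combinations(range(num_frames), k)) for k in scales}


# Four sampled frames, with relations over pairs and triples of frames.
relations = multiscale_frame_tuples(num_frames=4, scales=(2, 3))
pair_count = len(relations[2])
triple_count = len(relations[3])
longest_range_pair = relations[2][2]  # frames 0 and 3: a long-term relation
```

    In a relation-style network, each tuple's frame features would typically be combined by a small learned function and the results aggregated per scale; because tuples can span distant frames, they expose long-term dependencies that short temporal convolutions miss.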

    A computer vision system for detecting and analysing critical events in cities

    Whether for commuting or leisure, cycling is a growing transport mode in many cities worldwide. However, it is still perceived as a dangerous activity. Although serious cycling incidents leading to major injuries are rare, the fear of getting hit or falling hinders the expansion of cycling as a major transport mode. Indeed, it has been shown that focusing on serious injuries only touches the tip of the iceberg. Near miss data can provide much more information about potential problems and how to avoid risky situations that may lead to serious incidents. Unfortunately, there is a knowledge gap in identifying and analysing near misses. This hinders drawing statistically significant conclusions to inform measures for the built environment that ensure a safer environment for people on bikes. In this research, we develop a method to detect and analyse near misses and their risk factors using artificial intelligence. This is accomplished by analysing video streams linked to near miss incidents within a novel framework relying on deep learning and computer vision. This framework automatically detects near misses and extracts their risk factors from video streams before analysing their statistical significance. It also provides practical solutions, implemented in a camera with embedded AI (URBAN-i Box) and a cloud-based service (URBAN-i Cloud), to tackle the stated issue in real-world settings for use by researchers, policy-makers, or citizens. The research aims to provide human-centred evidence that may enable policy-makers and planners to provide a safer built environment for cycling in London, or elsewhere. More broadly, this research aims to contribute to the scientific literature with the theoretical and empirical foundations of a computer vision system that can be utilised for detecting and analysing other critical events in a complex environment. Such a system can be applied to a wide range of events, such as traffic incidents, crime or overcrowding

    Analyzing Sanctioned Suicide: a case study on pro-choice suicide sites

    According to the World Health Organization, close to 700,000 people take their own lives every year. Suicide has always been a socially important topic, so much so that free hotlines, help bots and automatic banners are displayed and easily accessible to people who search for related keywords on the web. In the last year, the existence of Sanctioned Suicide has come to light: a pro-choice forum discussing suicide, where users can both look for help with their recovery and research and ask questions about methods and how to acquire them. These types of sites have yet to be extensively researched in the literature. Their analysis could allow us to better understand which topics are discussed and how these communities act, which is very useful knowledge for suicide prevention and for helping suicidal individuals. In this thesis, we use Sanctioned Suicide as a case study and investigate how it is organized, what knowledge can be found and how users communicate in this environment. We have collected data for a total of 53K threads, 700K comments and 16K users. We use this dataset to analyze user trends, extract the topics of conversation in the forum and uncover hidden relations. Our analyses show that 30% of the topics found in Sanctioned Suicide discussions deal with suicide methods. We also discover that Covid has been a distress factor for users, especially during the first lockdown, highlighting a strong connection between talk of suicide and Covid

    Subspace Representations and Learning for Visual Recognition

    Pervasive and affordable sensor and storage technology enables the acquisition of an ever-rising amount of visual data. The ability to extract semantic information by interpreting, indexing and searching visual data is impacting domains such as surveillance, robotics, intelligence, human-computer interaction, navigation, healthcare, and several others. This further stimulates the investigation of automated extraction techniques that are more efficient, and robust against the many sources of noise affecting the already complex visual data, which carries the semantic information of interest. We address the problem by designing novel visual data representations, based on learning data subspace decompositions that are invariant against noise, while being informative for the task at hand. We use this guiding principle to tackle several visual recognition problems, including detection and recognition of human interactions from surveillance video, face recognition in unconstrained environments, and domain generalization for object recognition. By interpreting visual data with a simple additive noise model, we consider the subspaces spanned by the model portion (model subspace) and the noise portion (variation subspace). We observe that decomposing the variation subspace against the model subspace gives rise to the so-called parity subspace. Decomposing the model subspace against the variation subspace instead gives rise to what we name the invariant subspace. We extend the use of kernel techniques to the parity subspace. This enables modeling the highly non-linear temporal trajectories describing human behavior, and performing detection and recognition of human interactions. In addition, we introduce supervised low-rank matrix decomposition techniques for learning the invariant subspace for two other tasks. We learn invariant representations for face recognition from grossly corrupted images, and we learn object recognition classifiers that are invariant to the so-called domain bias. Extensive experiments using the publicly available benchmark datasets for each of the three tasks show that learning representations based on subspace decompositions invariant to the sources of noise leads to results comparable to or better than the state of the art