    A robust abnormal behavior detection method using convolutional neural network

    A behavior is considered abnormal when it is seen as unusual in a given context. The definition of abnormal behavior varies with the situation: for example, people running in a field is considered normal, but running is deemed abnormal if it takes place in a mall. Similarly, loitering in alleys and fighting or pushing each other in public areas are considered abnormal under specific circumstances. Abnormal behavior detection is crucial due to the increasing crime rate in society; if an abnormal behavior can be detected early, tragedies can be avoided. In recent years, deep learning has been widely applied in computer vision and has achieved great success in human detection. In particular, Convolutional Neural Networks (CNNs) have been shown to achieve state-of-the-art performance in human detection. In this paper, a CNN-based abnormal behavior detection method is presented. The proposed approach automatically learns the most discriminative characteristics of human behavior from a large pool of videos containing normal and abnormal behaviors. Since the interpretation of abnormal behavior varies across contexts, extensive experiments have been carried out to assess various conditions and scopes, including crowd and single-person behavior detection and recognition. The proposed method is an end-to-end solution that handles abnormal behavior under different conditions, including variations in background, number of subjects (individual, two persons, or crowd), and a range of diverse unusual human activities. Experiments on five benchmark datasets validate the performance of the proposed approach.
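    To make the classification setup concrete, here is a minimal sketch of a frame-level CNN with a binary normal/abnormal output. The architecture, input size, and layer widths are illustrative assumptions, not the network described in the paper.

    import tensorflow as tf
    from tensorflow.keras import layers, models

    def build_behavior_cnn(input_shape=(224, 224, 3)):
        # Binary classifier: 1 = abnormal behavior, 0 = normal (assumed setup).
        model = models.Sequential([
            layers.Input(shape=input_shape),
            layers.Conv2D(32, 3, activation="relu"),
            layers.MaxPooling2D(),
            layers.Conv2D(64, 3, activation="relu"),
            layers.MaxPooling2D(),
            layers.Conv2D(128, 3, activation="relu"),
            layers.GlobalAveragePooling2D(),
            layers.Dense(64, activation="relu"),
            layers.Dense(1, activation="sigmoid"),  # P(abnormal)
        ])
        model.compile(optimizer="adam",
                      loss="binary_crossentropy",
                      metrics=["accuracy"])
        return model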

    International experience and approaches to the intellectual analysis of behavior in the e-government environment

    The role of the Internet in people’s daily lives, the impact of social networks on the formation of public opinion, the spread of mobile communications, and the collection of personal information in electronic information systems in the e-government environment have made the problem of “behavior analysis” even more relevant. To improve the efficiency of public administration during the formation of the information society, one of the most important tasks for government organizations is to correctly assess and predict citizens’ behavior and make the right decisions. The main goal of the intellectual analysis of behavior is to understand the logic of the activities of individuals and social groups. This article studies international practice in the intellectual analysis of behavior, examines the methods and algorithms used in this area, and identifies open problems. Proposals are developed for effectively solving questions of the intellectual analysis of behavior in the e-government environment. The approach we propose for intellectual analysis of behavior based on textual information consists of 4 levels: 1) primary processing, 2) document description, 3) classification of a set of documents into positive and negative classes, and 4) determination of the precision and recall of the classification. The use of semantic indicators for the intellectual analysis of behavior can help conduct research with greater accuracy and effectively solve behavioral prediction problems.
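    The four-level approach lends itself to a compact sketch. The following is a minimal illustration using scikit-learn; the concrete preprocessing, TF-IDF features, and logistic-regression classifier are stand-ins, since the article does not prescribe specific tools.

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import precision_score, recall_score

    def classify_behavior_texts(train_texts, train_labels, test_texts, test_labels):
        # Level 1: primary processing (here, simple lowercasing).
        train_texts = [t.lower() for t in train_texts]
        test_texts = [t.lower() for t in test_texts]

        # Level 2: document description as TF-IDF vectors.
        vectorizer = TfidfVectorizer()
        X_train = vectorizer.fit_transform(train_texts)
        X_test = vectorizer.transform(test_texts)

        # Level 3: classification into positive (1) and negative (0) classes.
        clf = LogisticRegression(max_iter=1000).fit(X_train, train_labels)
        predictions = clf.predict(X_test)

        # Level 4: precision and recall of the classification.
        return (precision_score(test_labels, predictions),
                recall_score(test_labels, predictions))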

    Smartphone-based object recognition with embedded machine learning intelligence for unmanned aerial vehicles

    Existing artificial intelligence solutions typically operate on powerful platforms with abundant computational resources. However, a growing number of emerging use cases, such as those based on unmanned aerial systems (UAS), require new solutions with embedded artificial intelligence on a highly mobile platform. This paper proposes an innovative UAS that explores machine learning (ML) capabilities in a smartphone-based mobile platform for object detection and recognition applications. A new system framework tailored to this challenging use case is designed, with a customized workflow specified. Furthermore, the design of the embedded ML leverages TensorFlow, a cutting-edge open-source ML framework. The prototype of the system integrates all the architectural components into a fully functional system, and it is suitable for real-world operational environments such as search and rescue use cases. Experimental results validate the design and prototyping of the system and demonstrate overall improved performance compared with the state of the art across a wide range of metrics.
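    For a sense of what smartphone-embedded inference looks like, below is a minimal TensorFlow Lite sketch of the kind of on-device object detection the paper builds on. The model file name is a placeholder, and the actual workflow in the paper is more elaborate.

    import numpy as np
    import tensorflow as tf

    # Load a converted detection model; "detector.tflite" is a placeholder name.
    interpreter = tf.lite.Interpreter(model_path="detector.tflite")
    interpreter.allocate_tensors()

    input_details = interpreter.get_input_details()
    output_details = interpreter.get_output_details()

    # A single camera frame, resized to the model's expected input shape
    # (zeros here as a stand-in for real pixel data).
    frame = np.zeros(input_details[0]["shape"], dtype=input_details[0]["dtype"])

    interpreter.set_tensor(input_details[0]["index"], frame)
    interpreter.invoke()
    detections = interpreter.get_tensor(output_details[0]["index"])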

    Criminal Intention Detection at Early Stages of Shoplifting Cases by Using 3D Convolutional Neural Networks

    Crime generates significant losses, both human and economic. Every year, billions of dollars are lost due to attacks, crimes, and scams. Surveillance video camera networks generate vast amounts of data, and surveillance staff cannot process all the information in real time. Human sight has critical limitations, and visual focus is one of the most critical when dealing with surveillance: in a surveillance room, a crime can occur in a different screen segment or on a distinct monitor, and the surveillance staff may overlook it. Our proposal focuses on shoplifting crimes by analyzing situations that an average person would consider typical but that may eventually lead to a crime. While other approaches identify the crime itself, we instead model suspicious behavior, which may occur before the build-up phase of a crime, by detecting precise segments of a video with a high probability of containing a shoplifting crime. By doing so, we provide the staff with more opportunities to act and prevent crime. We implemented a 3DCNN model as a video feature extractor and tested its performance on a dataset composed of daily-action and shoplifting samples. The results are encouraging, as the model correctly classifies suspicious behavior in most of the scenarios where it was tested. For example, when classifying suspicious behavior, the best model generated in this work obtains precision and recall values of 0.8571 and 1.0, respectively, in one of the test scenarios.
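    The following is a minimal sketch of a 3D convolutional video classifier in the spirit of the paper's feature extractor. The clip shape and layer sizes are assumptions for illustration, not the paper's exact architecture.

    import tensorflow as tf
    from tensorflow.keras import layers, models

    def build_3dcnn(clip_shape=(16, 112, 112, 3)):
        # Clips of 16 frames at 112x112 RGB; the shape is an assumption.
        model = models.Sequential([
            layers.Input(shape=clip_shape),
            layers.Conv3D(32, kernel_size=3, activation="relu"),
            layers.MaxPooling3D(pool_size=2),
            layers.Conv3D(64, kernel_size=3, activation="relu"),
            layers.MaxPooling3D(pool_size=2),
            layers.GlobalAveragePooling3D(),
            layers.Dense(64, activation="relu"),
            layers.Dense(1, activation="sigmoid"),  # P(suspicious segment)
        ])
        model.compile(optimizer="adam",
                      loss="binary_crossentropy",
                      metrics=[tf.keras.metrics.Precision(),
                               tf.keras.metrics.Recall()])
        return model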

    Generalized Video Anomaly Event Detection: Systematic Taxonomy and Comparison of Deep Models

    Video Anomaly Detection (VAD) serves as a pivotal technology in intelligent surveillance systems, enabling the temporal or spatial identification of anomalous events within videos. While existing reviews predominantly concentrate on conventional unsupervised methods, they often overlook the emergence of weakly-supervised and fully-unsupervised approaches. To address this gap, this survey extends the conventional scope of VAD beyond unsupervised methods, encompassing a broader spectrum termed Generalized Video Anomaly Event Detection (GVAED). By incorporating recent advancements rooted in diverse assumptions and learning frameworks, this survey introduces an intuitive taxonomy that navigates through unsupervised, weakly-supervised, supervised, and fully-unsupervised VAD methodologies, elucidating the distinctions and interconnections within these research trajectories. In addition, this survey supports prospective researchers by assembling a compilation of research resources, including public datasets, available codebases, programming tools, and pertinent literature. Furthermore, this survey quantitatively assesses model performance, delves into research challenges and directions, and outlines potential avenues for future exploration. Comment: Accepted by ACM Computing Surveys. For more information, please see our project page: https://github.com/fudanyliu/GVAE

    Human Activity Recognition (HAR) Using Wearable Sensors and Machine Learning

    Humans engage in a wide range of simple and complex activities. Human Activity Recognition (HAR) is typically framed as a classification problem in computer vision and pattern recognition, aimed at recognizing various human activities. Recent technological advancements, the miniaturization of electronic devices, and the deployment of cheaper and faster data networks have propelled environments augmented with contextual and real-time information, such as smart homes and smart cities. These context-aware environments, alongside smart wearable sensors, have opened the door to numerous opportunities for adding value and personalized services to citizens. Vision-based and sensor-based HAR find diverse applications in healthcare, surveillance, sports, event analysis, Human-Computer Interaction (HCI), rehabilitation engineering, and occupational science, among others, resulting in significantly improved human safety and quality of life. Despite being an active research area for decades, HAR still faces challenges in gesture complexity, computational cost on small devices, energy consumption, and data annotation limitations. In this research, we investigate methods to sufficiently characterize and recognize complex human activities, with the aim of improving recognition accuracy, reducing computational cost and energy consumption, and creating a research-grade sensor data repository to advance research and collaboration. This research examines the feasibility of detecting natural human gestures in common daily activities. Specifically, we utilize smartwatch accelerometer data and structured local context attributes and apply AI algorithms to determine the complex gesture activities of medication-taking, smoking, and eating. This dissertation is centered around modeling human activity and applying machine learning techniques to automatically detect specific activities using accelerometer data from smartwatches. Our work is the first to model human activity from wearable sensors with a linguistic representation of grammar and syntax, deriving clear semantics for complex activities whose alphabet comprises atomic activities. We apply machine learning to learn and predict complex human activities, and we demonstrate the use of one of our unified models to recognize two activities with a smartwatch: medication-taking and smoking. Another major part of this dissertation addresses the problem of HAR activity misalignment through edge-based computing at data origination points, leading to improved rapid data annotation, albeit with assumptions of subject fidelity in demarcating gesture start and end sections. Lastly, the dissertation describes a theoretical framework for implementing a library of shareable human activities. The results of this work can be applied in the implementation of a rich portal of usable human activity models, easily installable on handheld mobile devices such as phones or smart wearables, to assist human agents in discerning daily living activities. This is akin to a social-media platform of human gestures or capability models. The goal of such a framework is to bring the power of HAR into the hands of everyday users and to democratize the service to the public by enabling persons with special skills to share their skills or abilities through downloadable, usable trained models.
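    The dissertation's distinctive idea, complex activities as "sentences" over an alphabet of atomic activities, can be sketched compactly. In the toy example below, the atomic labels and the production rule for medication-taking are illustrative assumptions, not the dissertation's actual grammar.

    import re

    # Per-window atomic-activity predictions, e.g., from a classifier over
    # smartwatch accelerometer features (hypothetical labels).
    atomic_sequence = ["reach", "grasp", "hand_to_mouth", "swallow", "lower_hand"]

    # One-character codes let a regular expression act as the grammar.
    CODES = {"reach": "r", "grasp": "g", "hand_to_mouth": "m",
             "swallow": "s", "lower_hand": "l"}

    # Illustrative production for medication-taking: reach, optional grasp,
    # one or more hand-to-mouth gestures, swallow, lower the hand.
    MEDICATION_TAKING = re.compile(r"rg?m+sl")

    encoded = "".join(CODES[a] for a in atomic_sequence)
    if MEDICATION_TAKING.fullmatch(encoded):
        print("complex activity recognized: medication-taking")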

    A specialization of YOLOv3 for pedestrian detection

    Advisor: David Menotti Gomes. Master's dissertation - Universidade Federal do Paraná, Setor de Ciências Exatas, Programa de Pós-Graduação em Informática. Defense: Curitiba, 25/02/2019. Includes references: p. 111-117. Abstract: Pedestrian Detection is a Computer Vision task that locates pedestrians in images and videos for applications such as driving assistance, video surveillance, human interfaces, autonomous vehicles, and robots. Progress in these applications is likely to enhance quality of life, so they have received considerable attention in recent years. In Machine Learning, Deep Convolutional Neural Networks (DCNNs) have been the main tool for achieving the best results in many detection challenges. Despite continuous progress, the task is not yet saturated, and there is room for improvement, even to reach human-level accuracy. Detection methods commonly try to increase accuracy by using ever more complex models, which raise computational costs and typically compromise detection speed. Detection speed has proven to be as important as accuracy, with a direct impact on tasks such as surveillance, automotive safety, and robotics. In this work, we go against this trend. In our first approach, we bring YOLOv3, a real-time generic object detector, to the Caltech Pedestrian Detection Benchmark in order to evaluate its accuracy and speed against the top works in that challenge. To accomplish this, YOLOv3 is moved from a multiclass domain (e.g., the COCO Dataset with 80 classes) to the specific task of detecting a single class, that is, pedestrians. We demonstrate that it is faster than the top three works while maintaining consistent accuracy. In a second approach, we modify YOLOv3's network with the "weak semantic segmentation infusion" technique. The method enhanced pedestrian detection with no impact on detection speed, placing YOLOv3 in the 12th position of the Caltech benchmark, only 2.94% behind the best method on the main metric. Additionally, we introduce a pedestrian detection dataset based on the Itaipu Technological Park's video surveillance system. Almost 8,000 frames from 21 cameras compose the dataset, containing more than 30,000 pedestrians divided into 8 classes. Keywords: Pedestrian Detection, Video Surveillance, YOLO, Caltech Pedestrian Dataset, PTI01 Pedestrian Dataset.
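    The move from 80 COCO classes to a single pedestrian class follows the standard Darknet recipe for retargeting YOLOv3: set classes=1 in each [yolo] layer and resize the convolutional layer feeding it, whose filter count depends on the number of classes. A small sketch of that arithmetic (the thesis does not detail its exact configuration steps):

    # Each YOLOv3 [yolo] head predicts 3 anchor boxes per cell; the
    # convolutional layer feeding it needs (classes + 5) * 3 filters
    # (4 box coordinates + 1 objectness score + class scores).
    def yolo_head_filters(num_classes: int, anchors_per_scale: int = 3) -> int:
        return (num_classes + 5) * anchors_per_scale

    print(yolo_head_filters(80))  # 255 filters in the stock COCO config
    print(yolo_head_filters(1))   # 18 filters when detecting only pedestrians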