Search CORE

150 research outputs found

Real-World Anomaly Detection in Video Using Spatio-Temporal Features Analysis for Weakly Labelled Data with Auto Label Generation

Author: Chaudhari Jitendra P.
Nayak Rikin
Publication venue: Faculty of Electrical Engineering, J.J. Strossmayer University of Osijek
Publication date: 01/01/2023
Field of study

Detecting anomalies in videos is a complex task due to diverse content, noisy labeling, and a lack of frame-level labeling. To address these challenges in weakly labeled datasets, we propose a novel custom loss function in conjunction with the multi-instance learning (MIL) algorithm. Our approach utilizes the UCF Crime and ShanghaiTech datasets for anomaly detection. The UCF Crime dataset includes labeled videos depicting a range of incidents such as explosions, assaults, and burglaries, while the ShanghaiTech dataset is one of the largest anomaly datasets, with over 400 video clips featuring three different scenes and 130 abnormal events. We generated pseudo labels for videos using the MIL technique to detect frame-level anomalies from video-level annotations, and to train the network to distinguish between normal and abnormal classes. We conducted extensive experiments on the UCF Crime dataset using C3D and I3D features to test our model\u27s performance. For the ShanghaiTech dataset, we used I3D features for training and testing. Our results show that with I3D features, we achieve an 84.6% frame-level AUC score for the UCF Crime dataset and a 92.27% frame-level AUC score for the ShanghaiTech dataset, which are comparable to other methods used for similar datasets

Efficient activity recognition using lightweight CNN and DS-GRU network for surveillance applications

Author: Baik Sung Wook
Ding Weiping
Haq Ijaz Ul
Muhammad Khan
Palade Vasile
Ullah Amin
Publication venue: 'Elsevier BV'
Publication date: 01/05/2021
Field of study

テキタイテキセイセイネットワークヲモチイタカンシカメラエイゾウニタイスルイジョウケンチシュホウニカンスルケンキュウ

Author: Saypadith Savath
サイパディスサバス
Publication venue
Publication date
Field of study

RecapNet: Action Proposal Generation Mimicking Human Cognitive Process

Author: Chen Yang
Li Yong
Lin Zhiwei
Snoussi Hichem
Wang Hui
Wang Tian
Zhu Aichun
Publication venue
Publication date: 01/01/2020
Field of study

International audienceGenerating action proposals in untrimmed videos is a challenging task, since video sequences usually contain lots of irrelevant contents and the duration of an action instance is arbitrary. The quality of action proposals is key to action detection performance. The previous methods mainly rely on sliding windows or anchor boxes to cover all ground-truth actions, but this is infeasible and computationally inefficient. To this end, this article proposes a RecapNet--a novel framework for generating action proposal, by mimicking the human cognitive process of understanding video content. Specifically, this RecapNet includes a residual causal convolution module to build a short memory of the past events, based on which the joint probability actionness density ranking mechanism is designed to retrieve the action proposals. The RecapNet can handle videos with arbitrary length and more important, a video sequence will need to be processed only in one single pass in order to generate all action proposals. The experiments show that the proposed RecapNet outperforms the state of the art under all metrics on the benchmark THUMOS14 and ActivityNet-1.3 datasets. The code is available publicly at https://github.com/tianwangbuaa/RecapNet

Ulster University's Research Portal

Hal-Diderot

Recommended from our members

View-invariant gait person re-identification with spatial and temporal attention

Author: Rahi Babak
Publication venue: Brunel University London
Publication date: 01/01/2021
Field of study

This thesis was submitted for the award of Doctor of Philosophy and was awarded by Brunel University LondonPerson re-identification at a distance across multiple none overlapping cameras has been an active research area for years. In the past ten years, Short term Person Re-Id techniques have made great strides in terms of accuracy using only appearance features in limited environments. However, massive intraclass variations and inter-class confusion limit their ability to be used in practical applications. Moreover, appearance consistency can only be assumed in a short time span from one camera to the other. Since the holistic appearance will change drastically over days and weeks, the technique, as mentioned above, will be ineffective. Practical applications usually require a long-term solution in which the subject appearance and clothing might have changed after a significant period has elapsed. Facing these problems, soft biometric features such as Gait have been proposed in the past. Nevertheless, even Gait can vary with illness, ageing and changes in the emotional state, changes in walking surfaces, shoe type, clothes type, objects carried by the subject and even clutter in the scene. Therefore, Gait is considered a temporal cue that could provide biometric motion information. On the other hand, the shape of the human body could be viewed as a spatial signal which can produce valuable information. So, extracting discriminative features from both spatial and temporal domains would be very beneficial to this research. Therefore, this thesis focuses on finding the best and most robust method to tackle the gait human Re-identification problem and solve it for practical applications. In real-world surveillance scenarios, the human gait cycle is primarily abnormal. These abnormalities include but not limited to temporal and spatial characteristics changes such as walking speed, broken gait phase and most importantly, varied camera angles. Our work performed an extensive literature study on spatial and temporal gait feature extraction methods with a focus on deep learning. Next, we conducted a comparative study and proposed a spatial-temporal approach for gait feature extraction using the fusion of multiple modalities, including optical-flow, raw silhouettes and RGB images. This approach was tested on two of the most challenging publicly available datasets for gait recognition TUM-GAID and CASIA-B, with excellent results presented in chapter 3. Furthermore, a modern spatial-temporal attention mechanism was proposed and tested on CASIA-B and OULP datasets which learns salient features independent of the gait cycle and view variations. The spatial attention layer in the proposed method extracts the spatial feature maps using a two-layered architecture that are fused using late fusion. It can pay attention to the identity-related salient regions in silhouette sequences discriminatively using the spatial feature maps. The temporal attention layer consists of an LSTM that encodes the temporal motion for silhouette sequences. It uses the encoded output vectors in the temporal attention architecture to focus on the most critical timesteps in the gait cycle and discard the rest. Furthermore, we improved the performance of our method by mapping our extracted spatial-temporal gait features to a discriminative null space for use in our Siamese architecture for crossmatching. We also conducted an element removal experiment on each segment of our spatial-temporal attentional network to gain insight into each component’s contribution to the performance. Our method showed outstanding robustness against abnormal gait cycles as well as viewpoint variations on both benchmark datasets

Brunel University Research Archive

Re-Identification in Urban Scenarios: A Review of Tools and Methods

Author: Hugo S. Oliveira
José J. M. Machado
João Manuel R. S. Tavares
Publication venue: 'MDPI AG'
Publication date: 01/11/2021
Field of study

With the widespread use of surveillance image cameras and enhanced awareness of public security, objects, and persons Re-Identification (ReID), the task of recognizing objects in non-overlapping camera networks has attracted particular attention in computer vision and pattern recognition communities. Given an image or video of an object-of-interest (query), object identification aims to identify the object from images or video feed taken from different cameras. After many years of great effort, object ReID remains a notably challenging task. The main reason is that an object's appearance may dramatically change across camera views due to significant variations in illumination, poses or viewpoints, or even cluttered backgrounds. With the advent of Deep Neural Networks (DNN), there have been many proposals for different network architectures achieving high-performance levels. With the aim of identifying the most promising methods for ReID for future robust implementations, a review study is presented, mainly focusing on the person and multi-object ReID and auxiliary methods for image enhancement. Such methods are crucial for robust object ReID, while highlighting limitations of the identified methods. This is a very active field, evidenced by the dates of the publications found. However, most works use data from very different datasets and genres, which presents an obstacle to wide generalized DNN model training and usage. Although the model's performance has achieved satisfactory results on particular datasets, a particular trend was observed in the use of 3D Convolutional Neural Networks (CNN), attention mechanisms to capture object-relevant features, and generative adversarial training to overcome data limitations. However, there is still room for improvement, namely in using images from urban scenarios among anonymized images to comply with public privacy legislation. The main challenges that remain in the ReID field, and prospects for future research directions towards ReID in dense urban scenarios, are also discussed