Carried baggage detection and recognition in video surveillance with foreground segmentation
Security cameras installed in public spaces or in private organizations continuously
record video data with the aim of detecting and preventing crime. For that reason,
video content analysis applications, either for real time (i.e. analytic) or post-event
(i.e. forensic) analysis, have gained high interest in recent years. In this thesis,
the primary focus is on two key aspects of video analysis, reliable moving object
segmentation and carried object detection & identification.
A novel moving object segmentation scheme by background subtraction is presented
in this thesis. The scheme relies on background modelling which is based
on multi-directional gradient and phase congruency. As a post processing step,
the detected foreground contours are refined by classifying the edge segments as
either belonging to the foreground or background. Furthermore, a contour completion
technique based on anisotropic diffusion is introduced to this area for the first time. The proposed
method targets cast shadow removal, gradual illumination change invariance, and
closed contour extraction.
A state of the art carried object detection method is employed as a benchmark
algorithm. This method includes silhouette analysis by comparing human temporal
templates with unencumbered human models. The implementation aspects of
the algorithm are improved by automatically estimating the viewing direction of
the pedestrian and are extended by a carried luggage identification module. As
the temporal template is a frequency template and the information that it provides
is not sufficient, a colour temporal template is introduced. The standard
steps followed by the state of the art algorithm are revisited from a
perspective extended by colour information, resulting in more accurate carried
object segmentation.
The experiments conducted in this research show that the proposed closed
foreground segmentation technique attains all the aforementioned goals. The incremental
improvements applied to the state of the art carried object detection
algorithm revealed the full potential of the scheme. The experiments demonstrate
the ability of the proposed carried object detection algorithm to outperform the
state of the art method.
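The background-subtraction idea in the abstract above can be illustrated with a deliberately simplified sketch. It is not the thesis's method: it assumes a plain gradient-magnitude background model with a running-average update, whereas the thesis uses multi-directional gradient and phase congruency. The function names, the `alpha` learning rate, and the `threshold` value are all hypothetical. The sketch only shows why a gradient-based model tends to ignore a uniform illumination change while still responding to object edges.

```python
def gradient_magnitude(img):
    """Gradient magnitude of a 2D list of intensities (indices clamped at borders)."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            gx = img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)]
            gy = img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x]
            out[y][x] = (gx * gx + gy * gy) ** 0.5
    return out


def update_background(bg, grad, alpha=0.05):
    """Running-average update of the gradient background model (alpha is an assumed rate)."""
    return [[(1 - alpha) * b + alpha * g for b, g in zip(brow, grow)]
            for brow, grow in zip(bg, grad)]


def foreground_mask(bg, grad, threshold=10.0):
    """Flag pixels whose gradient magnitude departs from the background model."""
    return [[abs(g - b) > threshold for b, g in zip(brow, grow)]
            for brow, grow in zip(bg, grad)]
```

A uniformly brighter frame has the same gradients as the background and yields an empty mask, while a frame containing a bright object is flagged along the object's edges.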
Gait recognition for person re-identification
Person re-identification across multiple cameras is an essential task in computer vision applications, particularly for tracking the same person in different scenes. Gait recognition, that is, recognition based on walking style, is commonly used for this purpose because human gait has unique characteristics that allow a person to be recognized from a distance. However, gait-based recognition can be limited by the position from which the images or videos are captured. Hence, this paper proposes a gait recognition approach for person re-identification. The proposed approach first estimates the angle of the gait and then performs recognition using convolutional neural networks. Multitask convolutional neural network models and extracted gait energy images (GEIs) are used to estimate the angle and recognize the gait. GEIs are extracted by first detecting the moving objects using background subtraction techniques. Training and testing are carried out on three well-known datasets: CASIA-(B), OU-ISIR, and OU-MVLP. The background modeling is evaluated using the Scene Background Modeling and Initialization (SBI) dataset. The proposed gait recognition method achieved an accuracy of more than 98% for almost all datasets. It achieved higher accuracy than other methods on CASIA-(B) and OU-MVLP and the best results on the OU-ISIR dataset.
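As a rough illustration of the gait energy image mentioned in the abstract above: a GEI is commonly defined as the pixel-wise average of size-normalised, aligned binary silhouettes over a gait cycle. The sketch below assumes the silhouettes are already segmented and aligned; the function name is hypothetical and nothing here reproduces the paper's own pipeline.

```python
def gait_energy_image(silhouettes):
    """Pixel-wise mean of aligned binary silhouettes (2D lists of 0/1 values).

    Assumes every silhouette has the same height and width.
    """
    if not silhouettes:
        raise ValueError("at least one silhouette is required")
    n = len(silhouettes)
    h, w = len(silhouettes[0]), len(silhouettes[0][0])
    totals = [[0] * w for _ in range(h)]
    for sil in silhouettes:
        for y in range(h):
            for x in range(w):
                totals[y][x] += sil[y][x]
    # Values near 1 mark body parts that stay in place; intermediate values mark moving limbs.
    return [[totals[y][x] / n for x in range(w)] for y in range(h)]
```

Pixels that are foreground in every frame get the value 1.0, pixels that are foreground in half the frames get 0.5, and background pixels stay at 0.0.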
RGB-D And Thermal Sensor Fusion: A Systematic Literature Review
In the last decade, the computer vision field has seen significant progress
in multimodal data fusion and learning, where multiple sensors, including
depth, infrared, and visual, are used to capture the environment across diverse
spectral ranges. Despite these advancements, there has been no systematic and
comprehensive evaluation of fusing RGB-D and thermal modalities to date. While
autonomous driving using LiDAR, radar, RGB, and other sensors has garnered
substantial research interest, along with the fusion of RGB and depth
modalities, the integration of thermal cameras and, specifically, the fusion of
RGB-D and thermal data, has received comparatively less attention. This might
be partly due to the limited number of publicly available datasets for such
applications. This paper provides a comprehensive review of both
state-of-the-art and traditional methods used in fusing RGB-D and thermal
camera data for various applications, such as site inspection, human tracking,
fault detection, and others. The reviewed literature has been categorised into
technical areas, such as 3D reconstruction, segmentation, object detection,
available datasets, and other related topics. Following a brief introduction
and an overview of the methodology, the study delves into calibration and
registration techniques, then examines thermal visualisation and 3D
reconstruction, before discussing the application of classic feature-based
techniques as well as modern deep learning approaches. The paper concludes with
a discourse on current limitations and potential future research directions. It
is hoped that this survey will serve as a valuable reference for researchers
looking to familiarise themselves with the latest advancements and contribute
to the RGB-DT research field.
Comment: 33 pages, 20 figures
Automated Detection and Counting of Pedestrians on an Urban Roadside
This thesis implements an automated system that counts pedestrians with 85% accuracy. Two approaches have been considered and evaluated in terms of count accuracy, cost and ease of deployment. The first approach employs the Autoscope Solo Terra, a traffic camera which is widely used to monitor vehicular traffic. The Solo Terra supports an image processing-based detector that counts the number of objects crossing user-defined areas in the captured image. The count is updated based on the amount of movement across the selected regions. Therefore, a second approach has been considered that uses a histogram of oriented gradients (HoG), an advanced vision-based algorithm proposed by Dalal et al. which distinguishes a pedestrian from a non-pedestrian based on an omega shape formed by the head and shoulders of a human being. The implemented detection software processes video frames that are streamed from a low-cost digital camera. The frames are divided into sub-regions which are scanned for an omega shape whenever movement is detected in those regions. It has been found that the HoG-based approach degrades in performance due to occlusion under dense pedestrian traffic conditions, whereas the Solo Terra approach appears to be more robust. Undercounts and overcounts were encountered using the Solo Terra approach. To combat the disadvantages of both approaches, they were integrated to form a single system where the count is incremented predominantly using the Solo Terra. The HoG-based approach corrects the obtained count under certain conditions. A preliminary prototype of the integrated system has been verified.
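The core building block of a histogram of oriented gradients, as referred to in the abstract above, can be sketched as a magnitude-weighted histogram of gradient orientations over one cell. This is a minimal sketch under stated assumptions: nine unsigned orientation bins over 0 to 180 degrees, central differences on interior pixels only, and no vote interpolation or block normalisation. The function name is hypothetical, and the classifier that would look for the omega shape is not shown.

```python
import math


def cell_orientation_histogram(cell, n_bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations.

    `cell` is a 2D list of intensities; only interior pixels are used so that
    central differences are always defined.
    """
    h, w = len(cell), len(cell[0])
    bin_width = 180.0 / n_bins
    hist = [0.0] * n_bins
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]
            gy = cell[y + 1][x] - cell[y - 1][x]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0.0:
                continue  # flat pixels cast no vote
            # Fold the direction into [0, 180) so opposite gradients share a bin.
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[int(angle // bin_width) % n_bins] += magnitude
    return hist
```

An intensity ramp along x puts all votes in the first bin, and a ramp along y puts them in the bin that covers 90 degrees.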
Biometric Systems
Because of the accelerating progress in biometrics research and the latest nation-state threats to security, this book's publication is not only timely but also much needed. This volume contains seventeen peer-reviewed chapters reporting the state of the art in biometrics research: security issues, signature verification, fingerprint identification, wrist vascular biometrics, ear detection, face detection and identification (including a new survey of face recognition), person re-identification, electrocardiogram (ECG) recognition, and several multi-modal systems. This book will be a valuable resource for graduate students, engineers, and researchers interested in understanding and investigating this important field of study.
Machine Learning Methods with Noisy, Incomplete or Small Datasets
In many machine learning applications, available datasets are sometimes incomplete, noisy or affected by artifacts. In supervised scenarios, it could happen that label information has low quality, which might include unbalanced training sets, noisy labels and other problems. Moreover, in practice, it is very common that available data samples are not enough to derive useful supervised or unsupervised classifiers. All these issues are commonly referred to as the low-quality data problem. This book collects novel contributions on machine learning methods for low-quality datasets, to contribute to the dissemination of new ideas to solve this challenging problem, and to provide clear examples of application in real scenarios
Video content analysis for automated detection and tracking of humans in CCTV surveillance applications
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. The problems of achieving a high detection rate with a low false alarm rate for human detection and tracking in video sequences, performance scalability, and improving response time are addressed in this thesis. The underlying causes are the effect of scene complexity, human-to-human interactions, scale changes, and scene background-human interactions. A two-stage processing solution, namely human detection and human tracking, with two novel pattern classifiers is presented. Scale-independent human detection is achieved by processing in the wavelet domain using square wavelet features. These features, used to characterise human silhouettes at different scales, are similar to the rectangular features used in [Viola 2001]. At the detection stage, two detectors are combined to improve the detection rate. The first detector is based on the shape outline of humans extracted from the scene using a reduced-complexity outline extraction algorithm. A shape mismatch measure is used to differentiate between the human and the background class. The second detector uses rectangular features as primitives for silhouette description in the wavelet domain. The marginal distribution of features collocated at a particular position on a candidate human (a patch of the image) is used to describe the silhouette statistically. Two similarity measures are computed between a candidate human and the model histograms of the human and non-human classes. The similarity measures are used to discriminate between the human and the non-human class. At the tracking stage, a tracker based on the joint probabilistic data association filter (JPDAF) for data association and motion correspondence is presented. Track clustering is used to reduce hypothesis enumeration complexity.
To improve response time as frame dimension, scene complexity, and number of channels increase, a scalable algorithmic architecture and an operating accuracy prediction technique are presented. A scheduling strategy for improving the response time and throughput by parallel processing is also presented.
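The rectangular features of [Viola 2001], which the abstract above names as a point of comparison, are usually evaluated with an integral image so that any rectangle sum costs four lookups. The sketch below illustrates that general technique on raw intensities; it does not reproduce the thesis's wavelet-domain features, and the function names are hypothetical.

```python
def integral_image(img):
    """Summed-area table with one row and one column of zero padding."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of the pixels in a rectangle, using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def two_rect_feature(ii, top, left, height, width):
    """Left-half sum minus right-half sum of a rectangle (width assumed even)."""
    half = width // 2
    return (rect_sum(ii, top, left, height, half)
            - rect_sum(ii, top, left + half, height, half))
```

Once the table is built, each feature costs a constant number of lookups regardless of rectangle size, which is what makes scanning many candidate patches affordable.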