Advancements In Crowd-Monitoring System: A Comprehensive Analysis of Systematic Approaches and Automation Algorithms: State-of-The-Art
Growing concerns over public safety have drawn the attention of governments and security agencies worldwide, which increasingly recognize the need for reliable and secure crowd-monitoring systems. Managing human gatherings effectively requires proactive measures that prevent unforeseen incidents and ensure a safe, well-coordinated environment. The scarcity of research on crowd-monitoring systems and their security implications has opened a growing area of investigation into approaches for safeguarding human congregations. Crowd-monitoring systems follow a two-pronged approach, encompassing vision-based and non-vision-based technologies, and this research analyzes both methodologies in depth. The efficacy of each approach depends on the specific environment and temporal context in which it is deployed, as each offers distinct advantages. This paper presents an in-depth analysis of the recent incorporation of artificial intelligence (AI) algorithms and models into automated crowd-monitoring systems, emphasizing their contemporary applications and effectiveness across contexts.
Audio-visual multi-modality driven hybrid feature learning model for crowd analysis and classification
The rapid emergence of advanced software systems, low-cost hardware, and decentralized cloud-computing technologies has broadened the horizon for vision-based surveillance, monitoring, and control. However, complex and inferior feature learning over visual artefacts or video streams, especially under extreme conditions, limits the majority of existing vision-based crowd analysis and classification systems. Retrieving event-sensitive or crowd-type-sensitive spatio-temporal features for different crowd types under extreme conditions is highly complex; the resulting loss of accuracy, and hence reliability, confines existing methods for real-time crowd analysis. Despite numerous efforts in vision-based approaches, the lack of acoustic cues often creates ambiguity in crowd classification. The strategic amalgamation of audio-visual features, by contrast, can enable accurate and reliable crowd analysis and classification. Motivated by this, this research develops a novel audio-visual multi-modality-driven hybrid feature learning model for crowd analysis and classification. A hybrid feature extraction model extracts deep spatio-temporal features using Gray-Level Co-occurrence Matrix (GLCM) descriptors and the AlexNet transfer-learning model; the GLCM features and AlexNet deep features are then fused by horizontal concatenation. For acoustic feature extraction, the audio track of the input video is processed with fixed-size sampling, pre-emphasis, block framing, and Hann windowing, followed by extraction of acoustic features including GTCC, GTCC-Delta, GTCC-Delta-Delta, MFCC, spectral entropy, spectral flux, spectral slope, and Harmonics-to-Noise Ratio (HNR).
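As a rough illustration of two of the building blocks named in this abstract, and not the authors' actual code, the sketch below computes a single-offset GLCM with a few standard texture descriptors and an audio front end (pre-emphasis, block framing, Hann windowing) in plain numpy. Function names, offsets, and frame/hop sizes are illustrative assumptions.

```python
import numpy as np

def glcm(image, dx=1, dy=0, levels=8):
    """Normalized gray-level co-occurrence matrix for one pixel offset.
    `image` must already be quantized to integer gray levels [0, levels)."""
    h, w = image.shape
    g = np.zeros((levels, levels), dtype=np.float64)
    for y in range(h - dy):
        for x in range(w - dx):
            g[image[y, x], image[y + dy, x + dx]] += 1
    total = g.sum()
    return g / total if total else g

def glcm_features(g):
    """Three common Haralick-style descriptors of a normalized GLCM."""
    i, j = np.indices(g.shape)
    return {
        "contrast": float((g * (i - j) ** 2).sum()),
        "energy": float((g ** 2).sum()),
        "homogeneity": float((g / (1.0 + (i - j) ** 2)).sum()),
    }

def frame_audio(signal, frame_len=400, hop=160, alpha=0.97):
    """Pre-emphasis, block framing, and Hann windowing of a 1-D signal;
    returns an array of shape (n_frames, frame_len)."""
    emphasized = np.append(signal[0], signal[1:] - alpha * signal[:-1])
    n = 1 + max(0, (len(emphasized) - frame_len) // hop)
    window = np.hanning(frame_len)
    return np.stack([emphasized[i * hop:i * hop + frame_len] * window
                     for i in range(n)])

# toy 4x4 image quantized to 4 gray levels
img = np.array([[0, 0, 1, 1],
                [0, 0, 1, 1],
                [2, 2, 3, 3],
                [2, 2, 3, 3]])
feats = glcm_features(glcm(img, dx=1, dy=0, levels=4))
frames = frame_audio(np.ones(1600))  # 8 frames of 400 samples each
```

In practice, such per-frame windows would feed cepstral extractors (MFCC, GTCC) and the GLCM descriptors would be concatenated with deep CNN features, per the pipeline described above.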
Finally, the extracted audio-visual features are fused to yield a composite multi-modal feature set, which is classified using a random-forest ensemble classifier. The multi-class classification yields a crowd-classification accuracy of 98.26%, precision of 98.89%, sensitivity of 94.82%, specificity of 95.57%, and an F-measure of 98.84%. The robustness of the proposed multi-modality-based crowd analysis model confirms its suitability for real-world crowd detection and classification tasks.
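A minimal sketch of the fusion step described above (hypothetical helper name, not the authors' implementation): horizontal concatenation of per-sample visual and acoustic feature vectors yields the composite multi-modal feature matrix that an ensemble classifier would consume.

```python
import numpy as np

def fuse_features(visual, acoustic):
    """Horizontally concatenate per-sample visual and acoustic feature
    matrices (rows = samples) into one composite multi-modal feature set."""
    return np.hstack([np.atleast_2d(visual), np.atleast_2d(acoustic)])

# toy example: 3 samples with 5 visual and 7 acoustic features each
fused = fuse_features(np.ones((3, 5)), np.zeros((3, 7)))  # shape (3, 12)
# The fused matrix would then train an ensemble model, e.g.
# sklearn.ensemble.RandomForestClassifier(n_estimators=100).fit(fused, labels)
# (labels and the scikit-learn dependency are outside this sketch).
```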