
    An Enhanced Spatio-Temporal Human Detected Keyframe Extraction

    Due to the widespread availability of Closed-Circuit Television surveillance, crime investigation is quite difficult because of the huge storage involved and the complex backgrounds. Content-based video retrieval is an excellent method to identify the best keyframes from these surveillance videos. As crime surveillance contains numerous action scenes, the existing keyframe extraction is not exemplary. Here, the Spatio-temporal Histogram of Oriented Gradients - Support Vector Machine (HOG-SVM) feature method, combined with background subtraction, is applied over the recovered crime video to highlight human presence in surveillance frames. Additionally, a Visual Geometry Group (VGG) network is trained on these frames to classify human-detected frames. These detected frames are processed to extract keyframes by computing an inter-frame difference against a threshold value to select the requisite human-detected keyframes. The experimental results show a compression ratio of 98.54% for HOG-SVM, against the proposed work's preferable compression ratio of 98.71%, which supports criminal investigation.
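    The pipeline outlined above can be illustrated with a minimal, hedged sketch: OpenCV's built-in background subtractor and HOG + linear-SVM people detector stand in for the spatio-temporal HOG-SVM stage, and a simple mean inter-frame difference against a threshold selects keyframes. The video filename, the thresholds, and the mask heuristic are illustrative assumptions, not the paper's exact settings.

import cv2

# background model and pre-trained person detector (stand-ins for the paper's stages)
backsub = cv2.createBackgroundSubtractorMOG2()
hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

cap = cv2.VideoCapture("crime_clip.mp4")   # hypothetical surveillance video
prev_gray, keyframes, idx = None, [], 0
DIFF_THRESHOLD = 30.0                      # assumed mean inter-frame difference threshold

while True:
    ok, frame = cap.read()
    if not ok:
        break
    fg_mask = backsub.apply(frame)                              # motion (foreground) mask
    boxes, _ = hog.detectMultiScale(frame, winStride=(8, 8))    # person detections
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    if len(boxes) > 0 and prev_gray is not None:
        diff = cv2.absdiff(gray, prev_gray)
        # keep the frame when a person is present and the scene changed enough
        if diff.mean() > DIFF_THRESHOLD and fg_mask.mean() > 5:
            keyframes.append(idx)
    prev_gray = gray
    idx += 1
cap.release()
print("human-detected keyframe indices:", keyframes)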

    AI-Based Analytics for Hawkers Identification in Video Surveillance for Smart Community

    Street hawking is a widespread phenomenon in urban areas globally, presenting challenges for local authorities such as traffic congestion, waste management, and negative impacts on the city's image. This research addresses key issues faced by authorities in managing hawkers, including resistance to formalization, maintaining urban aesthetics, waste disposal, and understanding user preferences. The study investigates the performance of the You Only Look Once (YOLO) algorithm, which utilizes Convolutional Neural Networks (CNN) for real-time object detection. To achieve this objective, the YOLOv5 algorithm is trained with a custom image dataset collected from the same camera along the street in the city area to detect five classes of objects, namely umbrella, table, stool, car, and people. Real images captured via camera and video surveillance were compiled as datasets, which were then used to train and test the algorithm. The study aims to provide insights into the data collection process for hawkers along the streets around the area and the development of real-time hawker detection for smart city applications.
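    As a hedged sketch of the detection side only, a YOLOv5 model fine-tuned on the five hawker-related classes could be loaded and run as below. The weights file and image path are assumptions; loading relies on the public ultralytics/yolov5 torch hub entry rather than the authors' own code.

import torch

# load custom fine-tuned weights (hypothetical file trained on the five classes)
model = torch.hub.load("ultralytics/yolov5", "custom", path="hawker_best.pt")
model.conf = 0.4  # confidence threshold for reported detections (assumed value)

results = model("street_camera_frame.jpg")   # hypothetical CCTV frame
detections = results.pandas().xyxy[0]        # one row per detected object
print(detections[["name", "confidence"]])    # e.g. umbrella, table, stool, car, people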

    Machine learning-based human observer analysis of video sequences

    The research contributes to the field of video analysis by proposing novel approaches to automatically generating human observer performance patterns that can be effectively used in advancing modern video analytic and forensic algorithms. Eye tracker and eye movement analysis technology are employed in medical research, psychology, cognitive science and advertising. The data collected on human eye movement from the eye tracker can be analyzed using machine and statistical learning approaches. Therefore, the study attempts to understand the visual attention pattern of people when observing captured CCTV footage. It intends to establish whether the eye gaze of the observer, which determines their behaviour, is dependent on the given instructions or the knowledge they learn from the surveillance task. The research attempts to understand whether the attention of the observer on human objects is identified and tracked differently across the different areas of the body of the tracked object. It also asks whether pattern analysis and machine learning can effectively replace the current conceptual and statistical approaches to the analysis of eye-tracking data captured within a CCTV surveillance task. A pilot study was employed that took around 30 minutes for each participant. It involved observing 13 different pre-recorded CCTV clips of public space. The participants were provided with a clear written description of the targets they should find in each video. The study included a total of 24 participants with varying levels of experience in analyzing CCTV video. A Tobii eye tracking system was employed to record the eye movements of the participants. The data captured by the eye tracking sensor were analyzed using statistical approaches in SPSS and machine learning algorithms in WEKA. The research concluded that differences in behavioural patterns exist which could be used to classify the participants of the study if appropriate machine learning algorithms are employed. Previous research on video analytics was limited to a few projects in which the human object being observed was viewed as one object, and hence a detailed analysis of human observer attention patterns based on human body part articulation had not been investigated. All previous attempts at human observer visual attention pattern analysis in CCTV video analytics and forensics used either conceptual or statistical approaches. These methods were limited with regard to making predictions and detecting hidden patterns. A novel approach of articulating the human objects to be identified and tracked in a visual surveillance task led to constrained results, which demanded the use of advanced machine learning algorithms for the classification of participants. The research conducted within the context of this thesis encountered several practical data collection and analysis challenges during formal CCTV operator based surveillance tasks. These made it difficult to obtain appropriate cooperation from expert CCTV operators for data collection. Therefore, if expert operators had been employed in the study rather than novice operators, a more discriminative and accurate classification would have been achieved. Machine learning approaches like ensemble learning and tree-based algorithms can be applied in cases where a more detailed analysis of human behaviour is needed.
Traditional machine learning approaches are challenged by recent advances in the fields of convolutional neural networks and deep learning. Therefore, future research could replace the traditional machine learning approaches employed in this study with convolutional neural networks. The current research was limited to 13 different videos with different descriptions given to the participants for identifying and tracking different individuals. The research can be expanded to include more complex demands with regard to changes in the analysis process.
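    The thesis analysed the eye-tracking features with SPSS and WEKA; purely as an analogous sketch, the same participant-classification step could be expressed in Python with scikit-learn. The feature file and column names below are assumptions, not the study's actual variables.

import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# hypothetical per-participant eye-movement features (fixations, saccades, dwell times)
data = pd.read_csv("eye_tracking_features.csv")
X = data.drop(columns=["participant_group"])
y = data["participant_group"]                 # e.g. novice vs. experienced observer

clf = RandomForestClassifier(n_estimators=200, random_state=0)
scores = cross_val_score(clf, X, y, cv=5)     # 5-fold cross-validated accuracy
print("mean classification accuracy: %.2f" % scores.mean())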

    Applying psychological science to the CCTV review process: a review of cognitive and ergonomic literature

    As CCTV cameras are used more and more often to increase security in communities, police are spending a larger proportion of their resources, including time, in processing CCTV images when investigating crimes that have occurred (Levesley & Martin, 2005; Nichols, 2001). As with all tasks, there are ways to approach this task that will facilitate performance and other approaches that will degrade performance, either by increasing errors or by unnecessarily prolonging the process. A clearer understanding of the psychological factors influencing the effectiveness of footage review will facilitate future training in best practice with respect to the review of CCTV footage. The goal of this report is to provide such understanding by reviewing research on footage review, research on related tasks that require similar skills, and experimental laboratory research about the cognitive skills underpinning the task. The report is organised to address five challenges to the effectiveness of CCTV review: the effects of the degraded nature of CCTV footage, distractions and interruptions, the length of the task, inappropriate mindset, and variability in people’s abilities and experience. Recommendations for optimising CCTV footage review include (1) conducting a cognitive task analysis to increase understanding of the ways in which performance might be limited, (2) exploiting technology advances to maximise the perceptual quality of the footage, (3) training people to improve the flexibility of their mindset as they perceive and interpret the images seen, (4) monitoring performance either on an ongoing basis, by using psychophysiological measures of alertness, or periodically, by testing screeners’ ability to find evidence in footage developed for such testing, and (5) evaluating the relevance of possible selection tests to distinguish effective from ineffective screeners.

    Deep-Facial Feature-Based Person Reidentification for Authentication in Surveillance Applications

    Person reidentification (Re-ID) is a problem recently faced in computer vision. Most of the existing methods focus on body features captured in the scene with high-end surveillance systems. However, this is unhelpful for authentication. The technology came up empty in surveillance scenarios such as London’s subway bomb blast and the brutal Bangalore ATM attack, even though the suspects’ images existed in official databases. Hence, the prime objective of this chapter is to develop an efficient facial feature-based person reidentification framework for a controlled scenario to authenticate a person. Initially, faces are detected by a faster region-based convolutional neural network (Faster R-CNN). Subsequently, landmark points are obtained using the supervised descent method (SDM) algorithm, and the face is recognized by a joint Bayesian model. Each image in the training database is given an ID and, based on its similarity with the query image, is ranked with the Re-ID index. The proposed framework overcomes challenges such as pose variations, low resolution, and partial occlusions (mask and goggles). The experimental results (accuracy) on a benchmark dataset demonstrate the effectiveness of the proposed method, which is inferred from the receiver operating characteristic (ROC) curve and cumulative matching characteristics (CMC) curve.
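    A hedged sketch of only the final ranking step is given below: gallery identities are ordered by cosine similarity of face embeddings to the query. The detection (Faster R-CNN), landmarking (SDM) and joint Bayesian stages described in the chapter are not reproduced here, and the embeddings are synthetic placeholders.

import numpy as np

def rank_gallery(query_emb, gallery_embs):
    """Return gallery indices sorted from most to least similar to the query."""
    q = query_emb / np.linalg.norm(query_emb)
    g = gallery_embs / np.linalg.norm(gallery_embs, axis=1, keepdims=True)
    similarities = g @ q                  # cosine similarity per gallery identity
    return np.argsort(-similarities)      # best match first (Re-ID rank order)

# toy usage with random vectors standing in for real face features
rng = np.random.default_rng(0)
gallery = rng.normal(size=(100, 128))     # 100 enrolled IDs, 128-d embeddings each
query = gallery[42] + 0.05 * rng.normal(size=128)
print("top-5 ranked IDs:", rank_gallery(query, gallery)[:5])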

    Pedestrian Detection and Tracking in Video Surveillance System: Issues, Comprehensive Review, and Challenges

    Pedestrian detection and monitoring in a surveillance system are critical for numerous application areas, which encompass unusual event detection, human gait analysis, congestion or crowded vicinity evaluation, gender classification, fall detection in elderly humans, etc. Researchers’ primary focus is to develop surveillance systems that can work in a dynamic environment, but there are major issues and challenges involved in designing such systems. These challenges occur at three different levels of pedestrian detection, viz. video acquisition, human detection, and tracking. The challenges in acquiring video include illumination variation, abrupt motion, complex background, shadows, object deformation, etc. Human detection and tracking challenges include varied poses, occlusion, tracking in crowded areas, etc. These result in a lower recognition rate. A brief summary of surveillance systems along with a comparison of pedestrian detection and tracking techniques in video surveillance is presented in this chapter. The publicly available pedestrian benchmark databases as well as future research directions on pedestrian detection are also discussed.
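    One simple detect-then-track scheme of the kind compared in such surveys can be sketched as follows: motion blobs from background subtraction act as detections, and blob centroids are associated frame to frame by nearest distance. The video path, blob area limit, and 50-pixel association threshold are assumptions; practical systems replace these stages with CNN detectors and Kalman or deep trackers.

import cv2
import numpy as np

backsub = cv2.createBackgroundSubtractorMOG2()
cap = cv2.VideoCapture("walkway.mp4")           # hypothetical surveillance clip
prev_centroids, total_links = [], 0

while True:
    ok, frame = cap.read()
    if not ok:
        break
    mask = backsub.apply(frame)
    mask = cv2.threshold(mask, 200, 255, cv2.THRESH_BINARY)[1]
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    centroids = []
    for c in contours:
        if cv2.contourArea(c) < 500:            # ignore small noise blobs
            continue
        x, y, w, h = cv2.boundingRect(c)
        centroids.append((x + w / 2.0, y + h / 2.0))
    # greedy nearest-centroid association with the previous frame (tracking stand-in)
    for cx, cy in centroids:
        if prev_centroids and min(np.hypot(cx - px, cy - py) for px, py in prev_centroids) < 50:
            total_links += 1                    # this blob continues an existing track
    prev_centroids = centroids
cap.release()
print("frame-to-frame track continuations:", total_links)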

    Sports Analytics With Computer Vision

    Computer vision in sports analytics is a relatively new development. With multi-million-dollar systems like STATS’ SportVU, professional basketball teams are able to collect extremely fine-grained data better than ever before. This concept can be scaled down to provide similar statistics collection to college and high school basketball teams. Here we investigate the creation of such a system using open-source technologies and less expensive hardware. In addition, using similar technology, we examine basketball free throws to see whether a shooter’s form has a specific relationship to a shot’s outcome. A system that learns this relationship could be used to provide feedback on a player’s shooting form.
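    The form-to-outcome relationship mentioned above could, as a hedged sketch, be modelled with a simple classifier over per-shot form features. The feature names (release_angle, elbow_angle, release_time) and the CSV file are illustrative assumptions rather than the project's actual measurements.

import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

shots = pd.read_csv("free_throws.csv")            # hypothetical per-shot form features
X = shots[["release_angle", "elbow_angle", "release_time"]]
y = shots["made"]                                  # 1 = made, 0 = missed

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)
model = LogisticRegression().fit(X_train, y_train)
print("held-out accuracy:", model.score(X_test, y_test))
# coefficients hint at which aspects of form move the make probability
print(dict(zip(X.columns, model.coef_[0].round(3))))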

    Use of Coherent Point Drift in computer vision applications

    This thesis presents the novel use of Coherent Point Drift (CPD) in improving the robustness of a number of computer vision applications. The CPD approach includes two methods for registering two images, rigid and non-rigid point set approaches, based on the transformation model used. The key characteristic of a rigid transformation is that the distance between points is preserved, which means it can be used in the presence of translation, rotation, and scaling. Non-rigid transformations, such as affine transforms, provide the opportunity of registering under non-uniform scaling and skew. The idea is to move one point set coherently to align with the second point set. The CPD method finds both the non-rigid transformation and the correspondence between the two point sets at the same time, without requiring an a priori declaration of the transformation model used.

    The first part of this thesis is focused on speaker identification in video conferencing. A real-time, audio-coupled, video-based approach is presented, which focuses more on the video analysis side rather than the audio analysis that is known to be prone to errors. CPD is effectively utilised for lip movement detection, and a temporal face detection approach is used to minimise false positives if the face detection algorithm fails to perform.

    The second part of the thesis is focused on multi-exposure and multi-focus image fusion with compensation for camera shake. Scale Invariant Feature Transform (SIFT) features are first used to detect keypoints in the images being fused. Subsequently this point set is reduced to remove outliers using RANSAC (RANdom SAmple Consensus), and finally the point sets are registered using CPD with non-rigid transformations. The registered images are then fused with a Contourlet-based image fusion algorithm that makes use of a novel alpha blending and filtering technique to minimise artefacts. The thesis evaluates the performance of the algorithm in comparison to a number of state-of-the-art approaches, including the key commercial products available in the market at present, showing significantly improved subjective quality in the fused images.

    The final part of the thesis presents a novel approach to Vehicle Make & Model Recognition (VMMR) in CCTV video footage. CPD is used to effectively remove the skew of detected vehicles, as CCTV cameras are not specifically configured for the VMMR task and may capture vehicles at different approaching angles. A LESH (Local Energy Shape Histogram) feature-based approach is used for vehicle make and model recognition, with the novelty that temporal processing is used to improve reliability. A number of further algorithms are used to maximise the reliability of the final outcome. Experimental results are provided to prove that the proposed system demonstrates an accuracy in excess of 95% when tested on real CCTV footage with no prior camera calibration.
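    The keypoint stage of the image fusion pipeline can be sketched as below: SIFT keypoints are matched between two exposures and RANSAC discards outlier correspondences, leaving point sets that a CPD registration step (e.g. via a CPD library) would then align. Image paths and the ratio-test/RANSAC thresholds are assumptions, not the thesis settings.

import cv2
import numpy as np

img1 = cv2.imread("exposure_a.jpg", cv2.IMREAD_GRAYSCALE)    # hypothetical input pair
img2 = cv2.imread("exposure_b.jpg", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

# ratio-test matching, then RANSAC to drop outlier correspondences
matches = cv2.BFMatcher().knnMatch(des1, des2, k=2)
good = [m for m, n in matches if m.distance < 0.75 * n.distance]

src = np.float32([kp1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
dst = np.float32([kp2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
H, inlier_mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)

inliers_src = src[inlier_mask.ravel() == 1].reshape(-1, 2)   # reduced point sets that a
inliers_dst = dst[inlier_mask.ravel() == 1].reshape(-1, 2)   # CPD step would then register
print("inlier correspondences kept:", len(inliers_src))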
