599 research outputs found

    Illumination invariant stationary object detection

    A real-time system is presented for the detection and tracking of moving objects that become stationary in a restricted zone. A new pixel classification method based on the segmentation history image is used to identify stationary objects in the scene. These objects are then tracked using a novel adaptive edge orientation-based tracking method. Experimental results show that the tracking technique gives a detection success rate of more than 95%, even if objects are partially occluded. The tracking results, together with the historic edge maps, are analysed to remove objects that are no longer stationary or are falsely identified as foreground regions because of sudden changes in the illumination conditions. The technique has been tested on over 7 h of video recorded at different locations and times of day, both outdoors and indoors. The results obtained are compared with other available state-of-the-art methods.

    Background Subtraction in Video Surveillance

    The aim of this thesis is the real-time detection of moving and new static objects in unconstrained surveillance environments monitored with static cameras. This is achieved based on the results provided by background subtraction. For this task, Gaussian Mixture Models (GMMs) and Kernel Density Estimation (KDE) are used. A thorough review of state-of-the-art formulations for the use of GMMs and KDE in background subtraction reveals some further development opportunities, which are tackled in a novel GMM-based approach incorporating a variance controlling scheme. The proposed approach covers both parametric and non-parametric models and yields a better method for background subtraction, with higher accuracy and easier parametrization of the models across different environments. It also converges to more accurate models of the scenes. The detection of moving objects is achieved by using the results of background subtraction. For the detection of new static objects, two background models, learning at different rates, are used. This allows for a multi-class pixel classification, which follows the temporality of the changes detected by means of background subtraction. In the first approach, background subtraction is performed with a parametric model and the results are shown. In the second approach, background subtraction is performed with the non-parametric KDE model. Furthermore, some video engineering has been carried out, in which the background subtraction algorithm is employed so that the background from one video and the foreground from another video are merged to form a new video. In this way, more complex video engineering with multiple videos is also possible. Finally, the results provided by region analysis can be used to improve the quality of the background models, thereby considerably improving the detection results.
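
The idea of two background models learning at different rates can be illustrated with a minimal sketch. It is not the thesis's implementation: it assumes simple per-pixel running averages instead of GMMs or KDE, treats a frame as a flat list of grey levels, and uses hypothetical names, learning rates, threshold, and class labels (`update`, `classify`, `run`, `fast_rate`, `slow_rate`, `thresh`).

```python
def update(bg, frame, rate):
    """Blend the new frame into a running-average background model."""
    return [(1.0 - rate) * b + rate * f for b, f in zip(bg, frame)]


def classify(frame, bg_fast, bg_slow, thresh=25.0):
    """Label each pixel by comparing it with both background models."""
    labels = []
    for f, bf, bs in zip(frame, bg_fast, bg_slow):
        fg_fast = abs(f - bf) > thresh   # differs from the fast-learning model
        fg_slow = abs(f - bs) > thresh   # differs from the slow-learning model
        if fg_fast and fg_slow:
            labels.append("moving")      # new to both models
        elif fg_slow:
            labels.append("static")      # already absorbed by the fast model only
        elif fg_fast:
            labels.append("uncovered")   # matches the slow model only (assumed label)
        else:
            labels.append("background")
    return labels


def run(frames, fast_rate=0.5, slow_rate=0.01):
    """Process a sequence and return the labels of the last frame."""
    bg_fast = [float(v) for v in frames[0]]
    bg_slow = [float(v) for v in frames[0]]
    labels = []
    for frame in frames[1:]:
        labels = classify(frame, bg_fast, bg_slow)
        bg_fast = update(bg_fast, frame, fast_rate)
        bg_slow = update(bg_slow, frame, slow_rate)
    return labels


# Pixel 1 changes early and stays; pixel 2 changes only in the last frame.
frames = [[10, 10, 10]] + [[10, 200, 10]] * 10 + [[10, 200, 200]]
print(run(frames))
```

A pixel that has stopped changing is absorbed quickly by the fast model but still differs from the slow one, which is what separates a new static object from a moving one in this sketch.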

    An improved Gaussian Mixture Model with post-processing for multiple object detection in surveillance video analytics

    The Gaussian Mixture Model (GMM) is an effective method for extracting foreground objects from video sequences. However, GMM fails to detect objects in challenging scenarios such as the presence of shadows, occlusion, and complex backgrounds. To handle these challenges, intrinsic and extrinsic enhancements to the traditional GMM are required. This paper presents a novel framework that combines an improved GMM with postprocessing for multiple object detection. In the proposed system, GMM with parameter initialization is considered an intrinsic improvement. Video preprocessing and postprocessing are considered extrinsic improvements. Integrating morphological operations with GMM yields better segmentation than traditional GMM, and it also increases detection performance by reducing false positives. Video preprocessing is the noise-removal step that prepares the input video for further processing. In the final step, the gradient of morphological operations is used for postprocessing. The proposed approach was tested on challenging surveillance video sequences from benchmark datasets such as PETS 2009 and CD 2014 (Change Detection). The experimental results are compared using ground truth and performance evaluation metrics. The results show that the proposed approach performs better than GMM, and that the method can detect objects effectively even under illumination variation and partial occlusion.
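
As an illustration of morphological postprocessing on a foreground mask, the following sketch shows a binary opening (which removes isolated false positives) and a morphological gradient (dilation minus erosion, which keeps object outlines). It is a sketch under stated assumptions, not the paper's pipeline: the 3x3 structuring element, the border handling (neighbourhoods truncated at the image edge), and all function names are chosen here for illustration.

```python
def _neighbourhood(mask, r, c):
    """Values in the 3x3 window around (r, c), truncated at the image border."""
    rows, cols = len(mask), len(mask[0])
    return [mask[i][j]
            for i in range(max(0, r - 1), min(rows, r + 2))
            for j in range(max(0, c - 1), min(cols, c + 2))]


def dilate(mask):
    """A pixel becomes 1 if any pixel in its 3x3 window is 1."""
    return [[max(_neighbourhood(mask, r, c)) for c in range(len(mask[0]))]
            for r in range(len(mask))]


def erode(mask):
    """A pixel stays 1 only if every pixel in its 3x3 window is 1."""
    return [[min(_neighbourhood(mask, r, c)) for c in range(len(mask[0]))]
            for r in range(len(mask))]


def opening(mask):
    """Erosion followed by dilation: removes blobs smaller than the window."""
    return dilate(erode(mask))


def morphological_gradient(mask):
    """Dilation minus erosion: marks the boundary band of each blob."""
    d, e = dilate(mask), erode(mask)
    return [[dv - ev for dv, ev in zip(dr, er)] for dr, er in zip(d, e)]


# A 3x3 blob in a 5x5 mask, and a single-pixel "noise" mask.
blob = [[1 if 1 <= r <= 3 and 1 <= c <= 3 else 0 for c in range(5)]
        for r in range(5)]
noise = [[1 if (r, c) == (2, 2) else 0 for c in range(5)] for r in range(5)]
print(morphological_gradient(blob))
print(opening(noise))
```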

    Video analytics for security systems

    This study has been conducted to develop robust event detection and object tracking algorithms that can be implemented in real-time video surveillance applications. The aim of the research has been to produce an automated video surveillance system that is able to detect and report potential security risks with minimum human intervention. Since the algorithms are designed to be implemented in real-life scenarios, they must be able to cope with strong illumination changes and occlusions. The thesis is divided into two major sections. The first section deals with event detection and edge-based tracking, while the second section describes colour measurement methods developed to track objects in crowded environments. The event detection methods presented in the thesis mainly focus on detection and tracking of objects that become stationary in the scene. Objects such as baggage left in public places or vehicles parked illegally can pose a serious security threat. A new pixel-based classification technique has been developed to detect objects of this type in cluttered scenes. Once detected, edge-based object descriptors are obtained and stored as templates for tracking purposes. The consistency of these descriptors is examined using an adaptive edge orientation-based technique. Objects are tracked and alarm events are generated if the objects are found to be stationary in the scene after a certain period of time. To evaluate the full capabilities of the pixel-based classification and adaptive edge orientation-based tracking methods, the model is tested using several hours of real-life video surveillance scenarios recorded at different locations and times of day from our own and publicly available databases (i-LIDS, PETS, MIT, ViSOR). The performance results demonstrate that the combination of pixel-based classification and adaptive edge orientation-based tracking gave a success rate of over 95%.
The proposed methods also give better detection and tracking results than the other available state-of-the-art methods. In the second part of the thesis, colour-based techniques are used to track objects in crowded video sequences in circumstances of severe occlusion. A novel Adaptive Sample Count Particle Filter (ASCPF) technique is presented that improves the performance of the standard Sample Importance Resampling Particle Filter by up to 80% in terms of computational cost. An appropriate particle range is obtained for each object and the concept of adaptive samples is introduced to keep the computational cost down. The objective is to keep the number of particles to a minimum and to increase them up to the maximum only as and when required. Variable standard deviation values for state vector elements have been exploited to cope with heavy occlusion. The technique has been tested on different video surveillance scenarios with variable object motion, strong occlusion and change in object scale. Experimental results show that the proposed method not only tracks the object with comparable accuracy to existing particle filter techniques but is also up to five times faster. Tracking objects in a multi-camera environment is discussed in the final part of the thesis. The ASCPF technique is deployed within a multi-camera environment to track objects across different camera views. Such environments can pose difficult challenges such as changes in object scale and colour features as the objects move from one camera view to another. Variable standard deviation values of the ASCPF have been utilized in order to cope with sudden colour and scale changes. As the object moves from one scene to another, the number of particles, together with the spread value, is increased to a maximum to reduce any effects of scale and colour change. Promising results are obtained when the ASCPF technique is tested on live feeds from four different camera views.
It was found that the ASCPF method not only tracked the moving object successfully across different views but also maintained a real-time frame rate owing to its reduced computational cost, indicating that the method is a potential practical solution for multi-camera tracking applications.

    Vision-based traffic surveys in urban environments

    This paper presents a state-of-the-art, vision-based vehicle detection and type classification system to perform traffic surveys from a roadside closed-circuit television camera. Vehicles are detected using background subtraction based on a Gaussian mixture model that can cope with vehicles that become stationary over a significant period of time. Vehicle silhouettes are described using a combination of shape and appearance features using an intensity-based pyramid histogram of orientation gradients (HOG). Classification is performed using a support vector machine, which is trained on a small set of hand-labeled silhouette exemplars. These exemplars are identified using a model-based preclassifier that utilizes calibrated images mapped by Google Earth to provide accurately surveyed scene geometry matched to visible image landmarks. Kalman filters track the vehicles to enable classification by majority voting over several consecutive frames. The system counts vehicles and separates them into four categories: car, van, bus, and motorcycle (including bicycles). Experiments with real-world data have been undertaken to evaluate system performance, and a vehicle detection rate of 96.45% and a classification accuracy of 95.70% have been achieved on this data. The authors gratefully acknowledge the Royal Borough of Kingston for providing the video data. S.A. Velastin is grateful for funding received from the Universidad Carlos III de Madrid, the European Union’s Seventh Framework Programme for research, technological development and demonstration under grant agreement nº 600371, the Ministerio de Economía y Competitividad (COFUND2013-51509) and Banco Santander.
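
Classification by majority voting over consecutive frames can be sketched as follows. This is only an illustration of the voting step: the per-frame labels would come from the classifier and the track identities from the tracker, and the names `majority_vote`, `classify_track`, and the `window` size are assumptions made here rather than details taken from the paper.

```python
from collections import Counter


def majority_vote(labels):
    """Return the most frequent label; ties go to the label seen first."""
    return Counter(labels).most_common(1)[0][0]


def classify_track(per_frame_labels, window=5):
    """Vote over the most recent `window` per-frame labels of one track."""
    return majority_vote(per_frame_labels[-window:])


# Hypothetical per-frame classifier outputs for two tracked vehicles.
tracks = {
    1: ["car", "van", "car", "car", "bus"],
    2: ["bus", "bus", "van", "bus"],
}
for track_id, labels in tracks.items():
    print(track_id, classify_track(labels))
```

Voting over a track smooths out occasional per-frame misclassifications, at the cost of delaying the decision until several frames have been observed.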

    VIDEO FOREGROUND LOCALIZATION FROM TRADITIONAL METHODS TO DEEP LEARNING

    These days, detection of Visual Attention Regions (VAR), such as moving objects, has become an integral part of many Computer Vision applications, viz. pattern recognition, object detection and classification, video surveillance, autonomous driving, human-machine interaction (HMI), and so forth. Moving object identification using bounding boxes has matured to the level of localizing the objects along their rigid borders, and the process is called foreground localization (FGL). Over the decades, many image segmentation methodologies have been well studied, devised, and extended to suit video FGL. Despite that, the problem of video foreground (FG) segmentation remains an intriguing yet appealing task due to its ill-posed nature and myriad of applications. Maintaining spatial and temporal coherence, particularly at object boundaries, remains challenging and computationally burdensome. It gets even harder when the background is dynamic, as with swaying tree branches or a shimmering water body, when there are illumination variations or shadows cast by the moving objects, or when the video sequences have jittery frames caused by vibrating or unstable camera mounts on a surveillance post or moving robot. At the same time, in the analysis of traffic flow or human activity, the performance of an intelligent system substantially depends on its robustness in localizing the VAR, i.e., the FG. To this end, the natural question arises as to what is the best way to deal with these challenges. Thus, the goal of this thesis is to investigate plausible real-time performant implementations, from traditional approaches to modern-day deep learning (DL) models, for FGL that can be applicable to many video content-aware applications (VCAA). It focuses mainly on improving existing methodologies through harnessing multimodal spatial and temporal cues for a delineated FGL.
The first part of the dissertation is dedicated to enhancing conventional sample-based and Gaussian mixture model (GMM)-based video FGL using a probability mass function (PMF), temporal median filtering, the fusion of CIEDE2000 color similarity, color distortion, and illumination measures, and the selection of an appropriate adaptive threshold to extract the FG pixels. Subjective and objective evaluations are carried out to show the improvements over a number of similar conventional methods. The second part of the thesis focuses on exploiting and improving deep convolutional neural networks (DCNN) for the same problem. Consequently, three models akin to an encoder-decoder (EnDec) network are implemented with various innovative strategies to improve the quality of the FG segmentation. The strategies include, but are not limited to, double encoding - slow decoding feature learning, multi-view receptive field feature fusion, and incorporating spatiotemporal cues through long short-term memory (LSTM) units in both the subsampling and upsampling subnetworks. Experimental studies are carried out thoroughly on all conditions, from baselines to challenging video sequences, to prove the effectiveness of the proposed DCNNs. The analysis demonstrates the architectural efficiency over other methods, while quantitative and qualitative experiments show the competitive performance of the proposed models compared to the state-of-the-art.
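
Temporal median filtering, one of the conventional ingredients mentioned above, can be sketched in a few lines. The sketch makes several simplifying assumptions that are not from the thesis: frames are flat lists of grey levels, the threshold is fixed rather than adaptive, no colour measures are fused, and the function names are hypothetical.

```python
from statistics import median


def median_background(frames):
    """Per-pixel median over a buffer of frames as the background estimate."""
    return [median(pixel_values) for pixel_values in zip(*frames)]


def foreground_mask(frame, background, thresh=30):
    """Mark pixels whose absolute difference from the background exceeds thresh."""
    return [abs(f - b) > thresh for f, b in zip(frame, background)]


# Five two-pixel frames; the second pixel is briefly covered by an object.
frames = [[10, 10], [12, 10], [11, 200], [10, 10], [11, 10]]
background = median_background(frames)
print(background)
print(foreground_mask(frames[2], background))
```

Because the median ignores values that appear in fewer than half of the buffered frames, a briefly passing object does not contaminate the background estimate in this sketch.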