807 research outputs found
Foreground Silhouette Extraction Robust to Sudden Changes of Background Appearance
Vision-based background subtraction algorithms model the intensity variation of each pixel over time to classify it as foreground or background. Unfortunately, such algorithms are sensitive to changes in background appearance, such as sudden illumination changes or videos projected behind the scene. In this work, we propose an algorithm that extracts foreground silhouettes without modeling intensity variation over time. Using a camera pair, the stereo mismatch is processed to produce a dense disparity map within a Total Variation (TV) framework. Experimental results show that, under sudden changes of background appearance, the proposed TV disparity-based extraction outperforms intensity-based algorithms as well as existing stereo-based approaches built on temporal depth variation and stereo mismatch.
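The TV regularization step can be illustrated in a few lines. The sketch below (an assumption for illustration, not the paper's actual solver; the function name and parameter values are invented) smooths a noisy disparity map by gradient descent on an L2-fidelity term plus a smoothed total-variation term, after which a simple disparity threshold would yield the foreground silhouette:

```python
import numpy as np

def tv_smooth(d, lam=0.5, tau=0.1, iters=100, eps=1e-3):
    """Smooth a noisy disparity map d by gradient descent on
    0.5*||u - d||^2 + lam * TV_eps(u), where TV_eps is a smoothed
    total-variation term. A minimal sketch under assumed parameters."""
    u = d.astype(float).copy()
    for _ in range(iters):
        # forward differences with a replicated border
        ux = np.diff(u, axis=1, append=u[:, -1:])
        uy = np.diff(u, axis=0, append=u[-1:, :])
        mag = np.sqrt(ux**2 + uy**2 + eps)
        px, py = ux / mag, uy / mag
        # backward-difference divergence of the normalized gradient
        div = (px - np.roll(px, 1, axis=1)) + (py - np.roll(py, 1, axis=0))
        u += tau * (lam * div - (u - d))
    return u
```

Because the decision is made on smoothed disparity rather than on per-pixel intensity history, a sudden change of background appearance leaves the foreground/background separation intact.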
Vision-based traffic surveys in urban environments
This paper presents a state-of-the-art vision-based vehicle detection and type classification system for performing traffic surveys from a roadside closed-circuit television camera. Vehicles are detected using background subtraction based on a Gaussian mixture model that can cope with vehicles that become stationary for a significant period of time. Vehicle silhouettes are described by a combination of shape and appearance features using an intensity-based pyramid histogram of orientation gradients (HOG). Classification is performed by a support vector machine trained on a small set of hand-labeled silhouette exemplars. These exemplars are identified using a model-based preclassifier that exploits calibrated images mapped by Google Earth to provide accurately surveyed scene geometry matched to visible image landmarks. Kalman filters track the vehicles to enable classification by majority voting over several consecutive frames. The system counts vehicles and separates them into four categories: car, van, bus, and motorcycle (including bicycles). Experiments with real-world data have been undertaken to evaluate system performance; a vehicle detection rate of 96.45% and a classification accuracy of 95.70% have been achieved on these data. The authors gratefully acknowledge the Royal Borough of Kingston for providing the video data. S.A. Velastin is grateful for funding received from the Universidad Carlos III de Madrid, the European Union's Seventh Framework Programme for research, technological development and demonstration under grant agreement nº 600371, el Ministerio de Economía y Competitividad (COFUND2013-51509), and Banco Santander.
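The detection front end described above can be sketched with a per-pixel adaptive background model. The snippet below is a single-Gaussian simplification of the Gaussian-mixture idea (the paper's actual model keeps several modes per pixel; class name and parameters here are assumptions): each pixel keeps a running mean and variance, flags large deviations as foreground, and updates only background pixels so that vehicles that stop moving are absorbed only slowly.

```python
import numpy as np

class RunningGaussianBG:
    """Per-pixel running Gaussian background model: a single-mode
    simplification of the GMM-based subtraction in the abstract."""
    def __init__(self, alpha=0.05, k=2.5):
        self.alpha, self.k = alpha, k   # learning rate, deviation threshold
        self.mean = self.var = None

    def apply(self, frame):
        f = frame.astype(float)
        if self.mean is None:           # bootstrap from the first frame
            self.mean, self.var = f.copy(), np.full_like(f, 15.0)
            return np.zeros(f.shape, bool)
        fg = np.abs(f - self.mean) > self.k * np.sqrt(self.var)
        bg = ~fg                         # update the model only where background
        self.mean[bg] += self.alpha * (f[bg] - self.mean[bg])
        self.var[bg] += self.alpha * ((f[bg] - self.mean[bg]) ** 2 - self.var[bg])
        return fg
```

In practice a full multi-mode implementation such as OpenCV's `BackgroundSubtractorMOG2` would replace this sketch; the extracted silhouettes would then feed the HOG + SVM classification stage.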
General Dynamic Scene Reconstruction from Multiple View Video
This paper introduces a general approach to dynamic scene reconstruction from multiple moving cameras without prior knowledge of, or limiting constraints on, the scene structure, appearance, or illumination. Existing techniques for dynamic scene reconstruction from multiple wide-baseline camera views primarily focus on accurate reconstruction in controlled environments, where the cameras are fixed and calibrated and the background is known. These approaches are not robust for general dynamic scenes captured with sparse moving cameras. Previous approaches to outdoor dynamic scene reconstruction assume prior knowledge of the static background appearance and structure. The primary contributions of this paper are twofold: an automatic method for initial coarse dynamic scene segmentation and reconstruction without prior knowledge of background appearance or structure; and a general robust approach for joint segmentation refinement and dense reconstruction of dynamic scenes from multiple wide-baseline static or moving cameras. Evaluation is performed on a variety of indoor and outdoor scenes with cluttered backgrounds and multiple dynamic non-rigid objects such as people. Comparison with state-of-the-art approaches demonstrates improved accuracy in both multiple-view segmentation and dense reconstruction. The proposed approach also eliminates the requirement for prior knowledge of scene structure and appearance.
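Dense reconstruction from wide-baseline views ultimately rests on a standard geometric primitive: triangulating a 3D point from its projections in two calibrated cameras. The sketch below shows the classical linear (DLT) triangulation; it is a textbook building block, not the paper's method, and the matrices used in the usage example are invented for illustration.

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one 3D point from two views.
    P1, P2: 3x4 camera projection matrices; x1, x2: 2D image points."""
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)      # null vector of A = homogeneous point
    X = Vt[-1]
    return X[:3] / X[3]

def project(P, X):
    """Project a 3D point with a 3x4 camera matrix (helper for testing)."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]
```

With moving cameras, the projection matrices themselves must first be estimated per frame, which is one reason the uncontrolled setting is so much harder than the fixed-and-calibrated one.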
Employing an RGB-D Sensor for Real-Time Tracking of Humans across Multiple Re-Entries in a Smart Environment
The term smart environment refers to physical spaces equipped with sensors feeding into adaptive algorithms that enable the environment to become sensitive and responsive to the presence and needs of its occupants. People with special needs, such as the elderly or disabled, stand to benefit most from such environments, as they offer sophisticated assistive functionalities supporting independent living and improved safety. In a smart environment, the key issue is to sense the location and identity of its users. In this paper, we tackle the problems of detecting and tracking humans in a realistic home environment by exploiting the complementary nature of (synchronized) color and depth images produced by a low-cost consumer-level RGB-D camera. Our system selectively feeds the complementary data emanating from the two vision sensors to different algorithmic modules, which together implement three sequential components: (1) object labeling based on depth-data clustering, (2) human re-entry identification based on comparing visual signatures extracted from the color (RGB) information, and (3) human tracking based on the fusion of both depth and RGB data. Experimental results show that this division of labor improves the system's efficiency and classification performance.
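The re-entry identification step, comparing visual signatures from the RGB stream, can be sketched with a joint color histogram matched by histogram intersection. This is a common minimal form of such a signature and an assumption for illustration; the paper's actual descriptor may well differ.

```python
import numpy as np

def color_signature(rgb_patch, bins=8):
    """Joint RGB histogram, L1-normalized, as a simple visual signature
    for a person's image patch (H, W, 3) with values in [0, 256)."""
    h, _ = np.histogramdd(rgb_patch.reshape(-1, 3).astype(float),
                          bins=bins, range=[(0, 256)] * 3)
    return h.ravel() / h.sum()

def signature_similarity(a, b):
    """Histogram intersection in [0, 1]; 1.0 means identical signatures."""
    return np.minimum(a, b).sum()
```

On re-entry, a newly detected person (segmented via the depth-clustering stage) would be compared against stored signatures, and the best match above a threshold would restore the previous identity.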
Human detection in surveillance videos and its applications - a review
Detecting human beings accurately in a visual surveillance system is crucial for diverse application areas, including abnormal event detection, human gait characterization, congestion analysis, person identification, gender classification, and fall detection for elderly people. The first step of the detection process is to detect an object in motion. Object detection can be performed using background subtraction, optical flow, or spatio-temporal filtering techniques. Once detected, a moving object can be classified as a human being using shape-based, texture-based, or motion-based features. This paper presents a comprehensive review, with comparisons, of available techniques for detecting human beings in surveillance videos. The characteristics of a few benchmark datasets, as well as future research directions in human detection, are also discussed.
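The simplest of the motion-detection options the review lists can be written in one line. The sketch below (a generic textbook technique, not any specific surveyed method; the threshold value is an assumption) produces a binary motion mask by absolute frame differencing, which subsequent shape-, texture-, or motion-based classifiers would then label as human or not.

```python
import numpy as np

def frame_difference(prev, curr, thresh=25):
    """Binary motion mask by absolute differencing of two grayscale
    frames: a pixel is 'moving' if its intensity changed by more
    than `thresh` between consecutive frames."""
    return np.abs(curr.astype(int) - prev.astype(int)) > thresh
```

In practice this raw mask is noisy and is usually cleaned with morphological filtering before blob extraction, which is why the review treats background subtraction and spatio-temporal filtering as stronger alternatives.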
Background Subtraction Based on Time-Series Clustering and Statistical Modeling (Optical Review, regular paper)
This paper proposes a robust method to detect and extract silhouettes of foreground objects from the video sequence of a static camera, based on an improved background subtraction technique. The proposed method statistically analyses the pixel history as time-series observations and detects motion using kernel density estimation. Two consecutive stages of the k-means clustering algorithm are utilized to identify the most reliable background regions and reduce false positives. A pixel- and object-based updating mechanism for the background model is presented to cope with challenges such as gradual and sudden illumination changes, ghost appearance, non-stationary background objects, and moving objects that remain static for more than half of the training period. Experimental results show the efficiency and robustness of the proposed method in detecting and extracting the silhouettes of moving objects in outdoor and indoor environments compared with conventional methods.
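The kernel-density motion test at the heart of such a method can be sketched per pixel: estimate the density of the current intensity under the last N background samples with a Gaussian kernel, and flag the pixel as foreground when that density is low. This is a minimal sketch of the detection stage only (the clustering and model-update stages are omitted, and the bandwidth and threshold are assumed values).

```python
import numpy as np

def kde_foreground(history, frame, sigma=5.0, thresh=1e-4):
    """Per-pixel Parzen (KDE) background test.
    history: (N, H, W) stack of recent background frames;
    frame:   (H, W) current frame.
    A pixel is foreground when its estimated density is below thresh."""
    diff = frame[None].astype(float) - history.astype(float)
    kern = np.exp(-0.5 * (diff / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
    density = kern.mean(axis=0)   # average kernel response over the history
    return density < thresh
```

Because the density is built from raw samples rather than a single parametric mode, multi-modal backgrounds (e.g., swaying vegetation) are handled naturally, which is the usual motivation for KDE over a single Gaussian.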
Video content analysis for automated detection and tracking of humans in CCTV surveillance applications
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. The problems of achieving a high detection rate with a low false alarm rate for human detection and tracking in video sequences, performance scalability, and improving response time are addressed in this thesis. The underlying causes are the effects of scene complexity, human-to-human interactions, scale changes, and background-human interactions. A two-stage processing solution, namely human detection followed by human tracking, with two novel pattern classifiers, is presented. Scale-independent human detection is achieved by processing in the wavelet domain using square wavelet features. These features, used to characterise human silhouettes at different scales, are similar to the rectangular features used in [Viola 2001]. At the detection stage, two detectors are combined to improve the detection rate. The first detector is based on the shape outline of humans extracted from the scene using a reduced-complexity outline extraction algorithm; a shape mismatch measure is used to differentiate between the human and background classes. The second detector uses rectangular features as primitives for silhouette description in the wavelet domain. The marginal distribution of features collocated at a particular position on a candidate human (a patch of the image) is used to describe the silhouette statistically. Two similarity measures are computed between a candidate human and the model histograms of the human and non-human classes, and these are used to discriminate between the two classes. At the tracking stage, a tracker based on the joint probabilistic data association filter (JPDAF) for data association and motion correspondence is presented. Track clustering is used to reduce the complexity of hypothesis enumeration.
To improve response time with increasing frame dimension, scene complexity, and number of channels, a scalable algorithmic architecture and an operating-accuracy prediction technique are presented. A scheduling strategy for improving response time and throughput through parallel processing is also presented.
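Rectangular features of the [Viola 2001] kind are computed in constant time via an integral image. The sketch below illustrates that primitive with a vertical two-rectangle (edge) feature; it is a generic reconstruction of the well-known technique, not code from the thesis, and the function names are invented.

```python
import numpy as np

def integral_image(img):
    """Integral image: ii[r, c] = sum of img[:r+1, :c+1]."""
    return img.cumsum(0).cumsum(1)

def rect_sum(ii, r0, c0, r1, c1):
    """Sum of img[r0:r1, c0:c1] in O(1) using the integral image."""
    total = ii[r1 - 1, c1 - 1]
    if r0 > 0:
        total -= ii[r0 - 1, c1 - 1]
    if c0 > 0:
        total -= ii[r1 - 1, c0 - 1]
    if r0 > 0 and c0 > 0:
        total += ii[r0 - 1, c0 - 1]
    return total

def two_rect_feature(img, r0, c0, r1, c1):
    """Vertical two-rectangle (edge) feature over the window
    img[r0:r1, c0:c1]: sum of the left half minus the right half."""
    ii = integral_image(img.astype(float))
    cm = (c0 + c1) // 2
    return rect_sum(ii, r0, c0, r1, cm) - rect_sum(ii, r0, cm, r1, c1)
```

Because every rectangle sum costs four array lookups regardless of size, thousands of such features can be evaluated per candidate window, which is what makes the silhouette description stage fast enough for video-rate detection.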