22 research outputs found

    Crowd counting and segmentation in visual surveillance

    Reference no. MP-PD.8
    In this paper, the crowd counting and segmentation problem is formulated as a maximum a posteriori problem, in which 3D human shape models are designed and matched with image evidence provided by foreground/background separation and probability of boundary. The solution is obtained by considering, in each iteration, only the human candidates that may be un-occluded, and then applying to them a validation and rejection strategy based on minimum description length. The merit of the proposed optimization procedure is that its computational cost is much smaller than that of global optimization methods, while its performance is comparable to them. The approach is shown to be robust to severe partial occlusions. © 2009 IEEE.
    The 16th IEEE International Conference on Image Processing (ICIP 2009), Cairo, Egypt, 7-10 November 2009. In International Conference on Image Processing Proceedings, 2009, p. 2573-257
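The iterate-then-validate scheme described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the occlusion relation, the per-candidate match scores, and the MDL model cost are all illustrative stand-ins for the quantities the paper computes from image evidence.

```python
def detect_iteratively(candidates, occludes, match_score, model_cost=1.0):
    """Sketch of the optimization loop: each iteration handles only
    candidates that no remaining candidate can occlude, then keeps
    those whose matched evidence beats an MDL-style model cost.
    `occludes[a]` lists candidates that `a` (being nearer) would occlude;
    `match_score[c]` stands in for the model-matching likelihood."""
    remaining = set(candidates)
    kept = []
    while remaining:
        # un-occluded = not occluded by any other remaining candidate
        front = {c for c in remaining
                 if not any(c in occludes.get(o, ()) for o in remaining if o != c)}
        if not front:  # break occlusion cycles by taking the best-scoring one
            front = {max(remaining, key=lambda c: match_score[c])}
        for c in front:
            if match_score[c] > model_cost:   # MDL-style validation
                kept.append(c)
        remaining -= front
    return kept
```

Handling only the un-occluded front in each iteration is what keeps the cost far below exhaustive configuration search while still accounting for inter-candidate occlusion.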

    Bayesian 3D model based human detection in crowded scenes using efficient optimization

    In this paper, we solve the problem of human detection in crowded scenes using a Bayesian 3D model based method. Human candidates are first nominated by a head detector and a foot detector; optimization is then performed to find the best configuration of the candidates and their corresponding shape models. The solution is obtained by decomposing the mutually related candidates into un-occluded and occluded ones in each iteration, and then performing model matching for the un-occluded candidates. To this end, in addition to some obvious clues, we also derive a graph that depicts the inter-object relations so that unreasonable decompositions are avoided. The merit of the proposed optimization procedure is that its computational cost is similar to that of greedy optimization methods, while its performance is comparable to that of global optimization approaches. Model matching is performed by employing both prior knowledge and image likelihood, where the priors include the distribution of individual shape models and a restriction on real-world inter-object distance, and the image likelihood is provided by foreground extraction and edge information. After model matching, a validation and rejection strategy based on minimum description length is applied to confirm the candidates that have reliable matching results. The proposed method is tested on both the publicly available Caviar dataset and a challenging dataset constructed by ourselves. The experimental results demonstrate the effectiveness of our approach. © 2010 IEEE.
    The 2011 IEEE Workshop on Applications of Computer Vision (WACV 2011), Kona, HI., 5-7 January 2011. In Proceedings of WACV2011, 2011, p. 557-56
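The combination of priors and image likelihood described above can be illustrated with a toy scoring function. Everything below is an assumption for illustration: the field names, the hard minimum-distance prior, and the log-likelihood form are stand-ins for the paper's actual shape priors and foreground/edge likelihood terms.

```python
import math

def log_posterior(candidate, others, fg_overlap, edge_match,
                  min_dist=0.5, shape_log_prior=0.0):
    """Toy Bayesian score for one candidate's shape-model match:
    log posterior = shape prior + inter-object distance prior
                    + image likelihood (foreground overlap, edge match).
    `fg_overlap` and `edge_match` in (0, 1] stand in for the evidence
    from foreground extraction and edge information."""
    # Hard prior: two people cannot be closer than min_dist in the real world.
    for o in others:
        if math.dist(candidate["pos"], o["pos"]) < min_dist:
            return float("-inf")   # physically impossible configuration
    return (shape_log_prior
            + math.log(fg_overlap + 1e-9)
            + math.log(edge_match + 1e-9))
```

A configuration search would then keep the set of candidates maximizing the sum of such scores, which is the role the paper's iterative decomposition plays efficiently.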

    Pedestrian detection for mobile bus surveillance

    In this paper, we present a system for pedestrian detection in scenes captured by mobile bus surveillance cameras in busy city streets. Our approach integrates scene localization, foreground/background separation, and pedestrian detection modules into a unified detection framework. The scene localization module performs a two-stage clustering of the video data: in the first stage, SIFT homography is applied to cluster frames by their structural similarity, and the second stage further clusters these aligned frames by lighting. This produces clusters of images that differ in viewpoint and lighting. A kernel density estimation (KDE) method for colour and gradient foreground-background separation is then used to construct a background model for each image cluster, which is subsequently used to detect all foreground pixels. Finally, pedestrians are identified using a hierarchical template matching approach. We have tested our system on a set of real bus video datasets, and the experimental results verify that our system works well in practice.
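The per-pixel KDE background test can be sketched as follows. This is a minimal sketch, assuming a Gaussian kernel with an illustrative bandwidth and independent colour channels; the paper's model additionally uses gradient features.

```python
import numpy as np

def kde_background_density(pixel, samples, bandwidth=10.0):
    """Kernel density estimate of a pixel value under recent background
    samples: a Gaussian kernel per channel, channels combined by product.
    `samples` is an (N, C) array of recent background values for this pixel."""
    samples = np.asarray(samples, dtype=float)
    diff = samples - np.asarray(pixel, dtype=float)        # (N, C)
    k = np.exp(-0.5 * (diff / bandwidth) ** 2) / (bandwidth * np.sqrt(2 * np.pi))
    return float(np.mean(np.prod(k, axis=1)))              # average over samples

def is_foreground(pixel, samples, threshold=1e-6):
    """A pixel whose density under the background model is too low
    is labelled foreground. Threshold is illustrative."""
    return kde_background_density(pixel, samples) < threshold
```

Building one such model per scene cluster, as the paper does, keeps each KDE's samples consistent in viewpoint and lighting.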

    Passenger monitoring in moving bus video

    In this paper, we present a novel person detection system for public transport buses that tackles the problem of changing illumination conditions. Our approach integrates a stable SIFT (Scale Invariant Feature Transform) background seat modeling mechanism with a human shape model in a weighted Bayesian framework to detect passengers on board buses. SIFT background modeling extracts stable local features on the pre-annotated background seat areas and tracks these features over time to build a global statistical background model for each seat. Since SIFT features are partially invariant to lighting, this background model can be used robustly to detect the seat occupancy status even under severe lighting changes. The human shape model further confirms the existence of a passenger when a seat is occupied. This yields a robust passenger monitoring system that is resilient to illumination changes. We evaluate the performance of the proposed system on a number of challenging video datasets obtained from bus cameras, and the experimental results show that it is superior to state-of-the-art people detection systems.
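The seat-occupancy decision can be illustrated with a small sketch. The descriptors below are plain vectors standing in for SIFT descriptors, and both thresholds are illustrative assumptions; the paper's system additionally tracks the features over time and fuses the result with a human shape model.

```python
import numpy as np

def seat_occupied(model_desc, frame_desc, match_thresh=0.3, min_visible=0.5):
    """A seat's background model is a set of stable local descriptors.
    If too few of them are still matched in the current frame, the
    background is being covered and the seat is declared occupied.
    model_desc: (M, D) stable seat features; frame_desc: (N, D) current features."""
    model = np.asarray(model_desc, dtype=float)
    frame = np.asarray(frame_desc, dtype=float)
    if len(frame) == 0:                     # no features visible at all
        return True
    # nearest-neighbour distance from each model feature to the frame
    d = np.linalg.norm(model[:, None, :] - frame[None, :, :], axis=2)
    visible = np.mean(d.min(axis=1) < match_thresh)
    return bool(visible < min_visible)      # most features hidden -> occupied
```

Because descriptor matching, not raw intensity, drives the decision, the test degrades gracefully under the lighting changes the paper targets.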

    Moving objects segmentation at a traffic junction from vehicular vision

    Automatic extraction/segmentation and recognition of moving objects in a road environment are often problematic, especially when cameras are mounted on a moving vehicle (vehicular vision), yet this remains a critical task in vision-based transportation safety. The essential problem is twofold: extracting the foreground from the moving background, and separating and recognizing pedestrians among the other moving objects, such as cars, that appear in the foreground. The challenge of our proposed technique is to use a single mobile camera to separate the foreground from the background, and to recognize pedestrians and other objects from vehicular vision, in order to achieve a low-cost, intelligent driver assistance system. In this paper, the normal distribution is employed for modelling pixel gray values. The proposed technique separates the foreground from the background by comparing the pixel gray values of an input image with the normal distribution model of the pixel. The model is renewed after the separation to give a new background model for the next image, and the renewal strategy changes depending on whether the concerned pixel is in the background or in the foreground. The performance of the technique was examined on real-world vehicle videos captured at a junction while a car turns left or right, and satisfactory results were obtained.
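The per-pixel normal-distribution model with label-dependent renewal can be sketched in a few lines. The deviation factor `k` and the two renewal rates are illustrative assumptions, not the paper's values.

```python
import numpy as np

def update_background(mean, var, frame, alpha=0.05, alpha_fg=0.005, k=2.5):
    """One step of a running per-pixel Gaussian background model:
    a pixel is foreground if its gray value deviates from the model
    mean by more than k standard deviations; the model is then renewed,
    with a slower rate on foreground pixels so moving objects do not
    get absorbed into the background."""
    frame = frame.astype(float)
    fg = np.abs(frame - mean) > k * np.sqrt(var)       # foreground mask
    rate = np.where(fg, alpha_fg, alpha)               # renewal depends on label
    new_mean = (1 - rate) * mean + rate * frame
    new_var = (1 - rate) * var + rate * (frame - mean) ** 2
    return fg, new_mean, new_var
```

Applying this frame by frame gives exactly the loop the abstract describes: separate, then renew the model for the next image.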

    Three-dimensional model-based human detection in crowded scenes

    In this paper, the problem of human detection in crowded scenes is formulated as a maximum a posteriori problem, in which, given a set of candidates, predefined 3-D human shape models are matched with image evidence, provided by foreground extraction and probability of boundary, to estimate the human configuration. The optimal solution is obtained by decomposing the mutually related candidates into unoccluded and occluded ones in each iteration, according to a graph description of the candidate relations, and then matching models only for the unoccluded candidates. A candidate validation and rejection process based on minimum description length and local occlusion reasoning is carried out after each iteration of model matching. The advantage of the proposed optimization procedure is that its computational cost is much smaller than that of global optimization methods, while its performance is comparable to them. The proposed method achieves a detection rate about 2% higher on a subset of images of the Caviar data set than the best result reported by previous works. We also demonstrate the performance of the proposed method on another challenging data set. © 2011 IEEE.

    Space Object Detection in Video Satellite Images Using Motion Information

    Compared to ground-based observation, space-based observation is an effective approach to cataloging and monitoring the growing number of space objects. In this paper, space object detection in video satellite images with a star-image background is studied. A new detection algorithm using motion information is proposed, incorporating not only the known satellite attitude motion but also the unknown object motion. The effect of satellite attitude motion on an image is analyzed quantitatively and decomposed into translation and rotation. Considering the continuity of object motion and brightness change, variable thresholding based on local image properties and on the detection results of the previous frame is used to segment each single-frame image. The algorithm then uses the correlation of object motion across multiple frames, together with the satellite attitude motion information, to detect the object. Experimental results on a video image from the Tiantuo-2 satellite show that this algorithm provides a good way to detect space objects.
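The variable-thresholding step based on local image properties can be sketched as follows. Window size and offset are illustrative, and the sketch omits the paper's use of the previous frame's detections; a real implementation would also compute the local mean with an integral image rather than an explicit loop.

```python
import numpy as np

def local_threshold(img, win=5, c=10.0):
    """Segment a single frame by comparing each pixel against the mean
    of its local window plus an offset c, so dim objects over an uneven
    star-field background can still be segmented."""
    img = img.astype(float)
    pad = win // 2
    padded = np.pad(img, pad, mode="edge")
    h, w = img.shape
    mean = np.empty_like(img)
    # local mean via a sliding window (clear but O(h * w * win^2))
    for i in range(h):
        for j in range(w):
            mean[i, j] = padded[i:i + win, j:j + win].mean()
    return img > mean + c
```

Per-frame masks produced this way would then be linked across frames using the motion-correlation step the abstract describes.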

    Transferring a generic pedestrian detector towards specific scenes.

    In recent years, significant progress has been made in learning generic pedestrian detectors from publicly available, manually labeled large-scale training datasets. However, when a generic pedestrian detector is applied to a specific, previously unseen scene, where the testing data (target examples) does not match the training data (source examples) because of variations in viewpoint, resolution, illumination and background, its accuracy may decrease greatly. In this thesis, a new framework is proposed for automatically adapting a pre-trained generic pedestrian detector to a specific traffic scene. The framework is two-phased. In the first phase, scene-specific cues in the video surveillance sequence are explored. Utilizing the multi-cue information, both confident positive and confident negative examples from the target scene are selected to re-train the detector iteratively. In the second phase, a new machine learning framework is proposed, incorporating not only example labels but also example confidences. Source and target examples are re-weighted according to their confidences, optimizing the performance of the final classifier. Both methods belong to semi-supervised learning and require very little human intervention. The proposed approaches significantly improve the accuracy of the generic pedestrian detector. Their results are comparable with a detector trained using a large number of manually labeled frames from the target scene. 
Comparison with other existing approaches tackling similar problems shows that the proposed approaches outperform many contemporary methods. The works were published at the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) in 2011 and 2012, respectively.
Detailed summary in vernacular field only.
Wang, Meng.
Thesis (M.Phil.)--Chinese University of Hong Kong, 2012.
Includes bibliographical references (leaves 42-45).
Abstracts also in Chinese.
Chapter 1 --- Introduction
Chapter 1.1 --- Pedestrian Detection
Chapter 1.1.1 --- Overview
Chapter 1.1.2 --- Statistical Learning
Chapter 1.1.3 --- Object Representation
Chapter 1.1.4 --- Supervised Statistical Learning in Object Detection
Chapter 1.2 --- Pedestrian Detection in Video Surveillance
Chapter 1.2.1 --- Problem Setting
Chapter 1.2.2 --- Challenges
Chapter 1.2.3 --- Motivations and Contributions
Chapter 1.3 --- Related Work
Chapter 1.4 --- Organizations of Chapters
Chapter 2 --- Label Inferring by Multi-Cues
Chapter 2.1 --- Data Set
Chapter 2.2 --- Method
Chapter 2.2.1 --- Confident Positive Examples of Pedestrians
Chapter 2.2.2 --- Confident Negative Examples from the Background
Chapter 2.2.3 --- Confident Negative Examples from Vehicles
Chapter 2.2.4 --- Final Scene-Specific Pedestrian Detector
Chapter 2.3 --- Experiment Results
Chapter 3 --- Transferring a Detector by Confidence Propagation
Chapter 3.1 --- Method
Chapter 3.1.1 --- Overview
Chapter 3.1.2 --- Initial Estimation of Confidence Scores
Chapter 3.1.3 --- Re-weighting Source Samples
Chapter 3.1.4 --- Confidence-Encoded SVM
Chapter 3.2 --- Experiments
Chapter 3.2.1 --- Datasets
Chapter 3.2.2 --- Parameter Setting
Chapter 3.2.3 --- Results
Chapter 4 --- Conclusions and Future Work
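The confidence-weighted re-training idea from the thesis's second phase can be sketched with a small stand-in. The thesis uses a confidence-encoded SVM; the code below substitutes weighted logistic regression purely for illustration, scaling each example's gradient contribution by its confidence, and all parameter values are assumptions.

```python
import numpy as np

def reweighted_logistic(X, y, conf, lr=0.1, epochs=200):
    """Logistic regression where each example's influence on the
    gradient is scaled by a per-example confidence in (0, 1] --
    an illustrative stand-in for a confidence-encoded classifier.
    Low-confidence (likely mislabeled or mismatched source) examples
    contribute less to the final decision boundary."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    conf = np.asarray(conf, dtype=float)
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # predicted probabilities
        g = conf * (p - y)                        # confidence-weighted residual
        w -= lr * (X.T @ g) / len(y)
        b -= lr * g.mean()
    return w, b
```

Re-weighting source and target examples this way lets confidently selected target-scene examples dominate training without discarding the source data outright.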