
    Total Variation Regularized Tensor RPCA for Background Subtraction from Compressive Measurements

    Background subtraction is a fundamental and widely studied task in video analysis, with applications in video surveillance, teleconferencing and 3D modeling. Recently, motivated by compressive imaging, background subtraction from compressive measurements (BSCM) has become an active research topic in video surveillance. In this paper, we propose a novel tensor-based robust PCA (TenRPCA) approach for BSCM that decomposes video frames into backgrounds with spatio-temporal correlations and foregrounds with spatio-temporal continuity in a tensor framework. In this approach, we use 3D total variation (TV) to enhance the spatio-temporal continuity of foregrounds, and Tucker decomposition to model the spatio-temporal correlations of the video background. Based on this idea, we design a basic tensor RPCA model over the video frames, dubbed the holistic TenRPCA model (H-TenRPCA). To characterize the correlations among groups of similar 3D patches of the video background, we further design a patch-group-based tensor RPCA model (PG-TenRPCA) that models the video background by joint Tucker decompositions of 3D patch groups. Efficient algorithms using the alternating direction method of multipliers (ADMM) are developed to solve the proposed models. Extensive experiments on simulated and real-world videos demonstrate the superiority of the proposed approaches over existing state-of-the-art approaches. Comment: To appear in IEEE TI
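As a rough illustration of the 3D total variation term mentioned in the abstract above, the following sketch computes an anisotropic 3D TV value for a video stored as a NumPy array. The function name `tv3d_anisotropic`, the (height, width, frames) layout, and the choice of the anisotropic (sum of absolute differences) form are assumptions made for this example; the abstract does not state which TV variant or weighting the paper uses.

```python
import numpy as np

def tv3d_anisotropic(video):
    """Anisotropic 3D total variation of a (height, width, frames) array.

    Hypothetical helper: sums absolute forward differences along the two
    spatial axes and the temporal axis. Not taken from the paper.
    """
    video = np.asarray(video, dtype=float)
    dh = np.abs(np.diff(video, axis=0)).sum()  # vertical differences
    dw = np.abs(np.diff(video, axis=1)).sum()  # horizontal differences
    dt = np.abs(np.diff(video, axis=2)).sum()  # temporal differences
    return dh + dw + dt

# A static 2x2 block in a 4x4x3 video: only spatial edges contribute.
block = np.zeros((4, 4, 3))
block[1:3, 1:3, :] = 1.0
tv_block = tv3d_anisotropic(block)
```

A foreground that is spatially compact and persists across frames yields a small value, which is the intuition behind using such a term to favor spatio-temporally continuous foregrounds.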

    Spatio-temporal Video Parsing for Abnormality Detection

    Abnormality detection in video poses particular challenges due to the infinite size of the class of all irregular objects and behaviors. Thus no (or by far not enough) abnormal training samples are available, and we need to find abnormalities in test data without actually knowing what they are. Nevertheless, the prevailing concept in the field is to search directly for individual abnormal local patches or image regions independently of one another. To address this problem, we propose a method for joint detection of abnormalities in videos by spatio-temporal video parsing. The goal of video parsing is to find a set of indispensable normal spatio-temporal object hypotheses that jointly explain all the foreground of a video while, at the same time, being supported by normal training samples. Consequently, we avoid a direct detection of abnormalities and discover them indirectly as those hypotheses that are needed to cover the foreground but are not themselves explained by normal samples. Abnormalities are localized by MAP inference in a graphical model, which we solve efficiently by formulating it as a convex optimization problem. We experimentally evaluate our approach on several challenging benchmark sets, improving over the state-of-the-art on all standard benchmarks in terms of both abnormality classification and localization. Comment: 15 pages, 12 figures, 3 tables

    An investigation into image and video foreground segmentation and change detection

    Detecting and segmenting spatio-temporal foreground objects in videos is important for motion pattern modelling and video content analysis. Extensive efforts have been made in the past decades. Nevertheless, video-based saliency detection and foreground segmentation remain challenging. On the one hand, the performance of image-based saliency detection algorithms is limited on complex content, while the temporal connectivity between frames is not well resolved. On the other hand, compared with the abundant image-based datasets, the datasets for video-level saliency detection and segmentation are usually smaller in scale and less diverse in content. Towards a better understanding of video-level semantics, this thesis investigates foreground estimation and segmentation at both the image level and the video level. First, this thesis demonstrates the effectiveness of traditional features in video foreground estimation and segmentation. Motion patterns obtained by optical flow are used to draw coarse estimates of the foreground objects. The coarse estimates are refined by aligning motion boundaries with the actual contours of the foreground objects using the HOG descriptor. A precise segmentation of the foreground is then computed from the refined foreground estimates and the video-level colour distribution. Second, a deep convolutional neural network (CNN) for image saliency detection, named HReSNet, is proposed. To improve the accuracy of saliency prediction, an independent feature refining network is implemented. A Euclidean distance loss is integrated into the loss computation to enhance the saliency predictions near object contours. The experimental results demonstrate that our network obtains competitive results compared with state-of-the-art algorithms. Third, a large-scale dataset for video saliency detection and foreground segmentation is built to enrich the diversity of current video-based foreground segmentation datasets. A supervised framework is also proposed as the baseline, which integrates our HReSNet, Long Short-Term Memory (LSTM) networks and a hierarchical segmentation network. Fourth, change detection in practice requires distinguishing expected changes with semantic meaning from unexpected changes. Therefore, a new CNN design is proposed to detect changes in multi-temporal high-resolution urban images. Experimental results show that our change detection network outperforms the competing algorithms by a significant margin.
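The coarse, motion-based foreground estimate described in the abstract above can be sketched as a threshold on optical-flow magnitude. This is only an illustration under stated assumptions: the function name `coarse_foreground_mask`, the (height, width, 2) flow layout, and the mean-plus-k-standard-deviations threshold are hypothetical choices for this example and are not taken from the thesis, which does not specify its estimation rule here.

```python
import numpy as np

def coarse_foreground_mask(flow, k=1.0):
    """Boolean mask of pixels whose flow magnitude is unusually large.

    flow: array of shape (height, width, 2) holding per-pixel (dx, dy).
    k:    hypothetical sensitivity parameter (assumption, not from the thesis).
    """
    flow = np.asarray(flow, dtype=float)
    magnitude = np.linalg.norm(flow, axis=-1)        # per-pixel motion strength
    threshold = magnitude.mean() + k * magnitude.std()
    return magnitude > threshold

# Synthetic flow field: a 2x2 region moving by (3, 4) on a static background.
flow = np.zeros((6, 6, 2))
flow[2:4, 2:4] = [3.0, 4.0]
mask = coarse_foreground_mask(flow)
```

In a fuller pipeline the flow field would come from an optical-flow estimator, and the resulting mask would only be a starting point for the boundary refinement and colour-based segmentation steps the abstract mentions.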