93 research outputs found

    Weighted Low-Rank Approximation of Matrices and Background Modeling

    Get PDF
    We primarily study a special a weighted low-rank approximation of matrices and then apply it to solve the background modeling problem. We propose two algorithms for this purpose: one operates in the batch mode on the entire data and the other one operates in the batch-incremental mode on the data and naturally captures more background variations and computationally more effective. Moreover, we propose a robust technique that learns the background frame indices from the data and does not require any training frames. We demonstrate through extensive experiments that by inserting a simple weight in the Frobenius norm, it can be made robust to the outliers similar to the â„“1\ell_1 norm. Our methods match or outperform several state-of-the-art online and batch background modeling methods in virtually all quantitative and qualitative measures.Comment: arXiv admin note: text overlap with arXiv:1707.0028

    Overcomplete Dictionary and Deep Learning Approaches to Image and Video Analysis

    Get PDF
    Extracting useful information while ignoring others (e.g. noise, occlusion, lighting) is an essential and challenging data analyzing step for many computer vision tasks such as facial recognition, scene reconstruction, event detection, image restoration, etc. Data analyzing of those tasks can be formulated as a form of matrix decomposition or factorization to separate useful and/or fill in missing information based on sparsity and/or low-rankness of the data. There has been an increasing number of non-convex approaches including conventional matrix norm optimizing and emerging deep learning models. However, it is hard to optimize the ideal l0-norm or learn the deep models directly and efficiently. Motivated from this challenging process, this thesis proposes two sets of approaches: conventional and deep learning based. For conventional approaches, this thesis proposes a novel online non-convex lp-norm based Robust PCA (OLP-RPCA) approach for matrix decomposition, where 0 < p < 1. OLP-RPCA is developed from the offline version LP-RPCA. A robust face recognition framework is also developed from Robust PCA and sparse coding approaches. More importantly, OLP-RPCA method can achieve real-time performance on large-scale data without parallelizing or implementing on a graphics processing unit. We mathematically and empirically show that our OLP-RPCA algorithm is linear in both the sample dimension and the number of samples. The proposed OLP-RPCA and LP-RPCA approaches are evaluated in various applications including Gaussian/non-Gaussian image denoising, face modeling, real-time background subtraction and video inpainting and compared against numerous state-of-the-art methods to demonstrate the robustness of the algorithms. In addition, this thesis proposes a novel Robust lp-norm Singular Value Decomposition (RP-SVD) method for analyzing two-way functional data. The proposed RP-SVD is formulated as an lp-norm based penalized loss minimization problem. The proposed RP-SVD method is evaluated in four applications, i.e. noise and outlier removal, estimation of missing values, structure from motion reconstruction and facial image reconstruction. For deep learning based approaches, this thesis explores the idea of matrix decomposition via Robust Deep Boltzmann Machines (RDBM), an alternative form of Robust Boltzmann Machines, which aiming at dealing with noise and occlusion for face-related applications, particularly. This thesis proposes an extension to texture modeling in the Deep Appearance Models (DAMs) by using RDBM to enhance its robustness against noise and occlusion. The extended model can cope with occlusion and extreme poses when modeling human faces in 2D image reconstruction. This thesis also introduces new fitting algorithms with occlusion awareness through the mask obtained from the RDBM reconstruction. The proposed approach is evaluated in various applications by using challenging face datasets, i.e. Labeled Face Parts in the Wild (LFPW), Helen, EURECOM and AR databases, to demonstrate its robustness and capabilities

    Aerial detection of ground moving objects

    Get PDF
    Automatic detection of ground moving objects (GMOs) from aerial camera platforms (ACPs) is essential in many video processing applications, both civilian and military. However, the extremely small size of GMOs and the continuous shaky motion of ACPs present challenges in detecting GMOs for traditional methods. In particular, existing detection methods fail to balance high detection accuracy and real-time performance. This thesis investigates the problem of GMOs detection from ACPs and overcoming the challenges and drawbacks that exist in traditional detection methods. The underlying assumption used in this thesis is based on principal component pursuits (PCP) in which the background of an aerial video is modelled as a low-rank matrix and the moving objects are modelled as sparse corrupting this video. The research in this thesis investigates the proposed problem in three directions: (1) handling the shaky motion in ACPs robustly with minimal computational cost, (2) improving the detection accuracy and radically lowering false detections via penalization term, and (3) extending PCP’s formulation to achieve adequate real-time performance. In this thesis, a series of novel algorithms are proposed to show the evolution of our research towards the development of KR-LNSP, a novel robust detection method which is characterized by high detection accuracy, low computational cost, adaptability to shaky motion in ACPs, and adequate real-time performance. Each of the proposed algorithms is intensively evaluated using different challenging datasets and compared with current state-of-the-art methods

    VIDEO FOREGROUND LOCALIZATION FROM TRADITIONAL METHODS TO DEEP LEARNING

    Get PDF
    These days, detection of Visual Attention Regions (VAR), such as moving objects has become an integral part of many Computer Vision applications, viz. pattern recognition, object detection and classification, video surveillance, autonomous driving, human-machine interaction (HMI), and so forth. The moving object identification using bounding boxes has matured to the level of localizing the objects along their rigid borders and the process is called foreground localization (FGL). Over the decades, many image segmentation methodologies have been well studied, devised, and extended to suit the video FGL. Despite that, still, the problem of video foreground (FG) segmentation remains an intriguing task yet appealing due to its ill-posed nature and myriad of applications. Maintaining spatial and temporal coherence, particularly at object boundaries, persists challenging, and computationally burdensome. It even gets harder when the background possesses dynamic nature, like swaying tree branches or shimmering water body, and illumination variations, shadows cast by the moving objects, or when the video sequences have jittery frames caused by vibrating or unstable camera mounts on a surveillance post or moving robot. At the same time, in the analysis of traffic flow or human activity, the performance of an intelligent system substantially depends on its robustness of localizing the VAR, i.e., the FG. To this end, the natural question arises as what is the best way to deal with these challenges? Thus, the goal of this thesis is to investigate plausible real-time performant implementations from traditional approaches to modern-day deep learning (DL) models for FGL that can be applicable to many video content-aware applications (VCAA). It focuses mainly on improving existing methodologies through harnessing multimodal spatial and temporal cues for a delineated FGL. The first part of the dissertation is dedicated for enhancing conventional sample-based and Gaussian mixture model (GMM)-based video FGL using probability mass function (PMF), temporal median filtering, and fusing CIEDE2000 color similarity, color distortion, and illumination measures, and picking an appropriate adaptive threshold to extract the FG pixels. The subjective and objective evaluations are done to show the improvements over a number of similar conventional methods. The second part of the thesis focuses on exploiting and improving deep convolutional neural networks (DCNN) for the problem as mentioned earlier. Consequently, three models akin to encoder-decoder (EnDec) network are implemented with various innovative strategies to improve the quality of the FG segmentation. The strategies are not limited to double encoding - slow decoding feature learning, multi-view receptive field feature fusion, and incorporating spatiotemporal cues through long-shortterm memory (LSTM) units both in the subsampling and upsampling subnetworks. Experimental studies are carried out thoroughly on all conditions from baselines to challenging video sequences to prove the effectiveness of the proposed DCNNs. The analysis demonstrates that the architectural efficiency over other methods while quantitative and qualitative experiments show the competitive performance of the proposed models compared to the state-of-the-art

    Super-resolution:A comprehensive survey

    Get PDF

    Mobile Robots Navigation

    Get PDF
    Mobile robots navigation includes different interrelated activities: (i) perception, as obtaining and interpreting sensory information; (ii) exploration, as the strategy that guides the robot to select the next direction to go; (iii) mapping, involving the construction of a spatial representation by using the sensory information perceived; (iv) localization, as the strategy to estimate the robot position within the spatial map; (v) path planning, as the strategy to find a path towards a goal location being optimal or not; and (vi) path execution, where motor actions are determined and adapted to environmental changes. The book addresses those activities by integrating results from the research work of several authors all over the world. Research cases are documented in 32 chapters organized within 7 categories next described
    • …
    corecore