1,002 research outputs found

    Video modeling via implicit motion representations

    Video modeling refers to the development of analytical representations that explain the intensity distribution in video signals. Based on such a representation, we can develop algorithms for particular video-related tasks, so video modeling provides a foundation that bridges video data and related tasks. Although many video models have been proposed in the past decades, the rise of new applications calls for more efficient and accurate video modeling approaches.
    Most existing video modeling approaches are based on explicit motion representations, where motion information is explicitly expressed by correspondence-based representations (i.e., motion velocity or displacement). Although they are conceptually simple, the limitations of those representations and the suboptimality of motion estimation techniques can degrade such video modeling approaches, especially when handling complex motion or non-ideal video observations. In this thesis, we propose to investigate video modeling without explicit motion representation. Motion information is implicitly embedded into the spatio-temporal dependency among pixels or patches instead of being explicitly described by motion vectors.
    Firstly, we propose a parametric model based on spatio-temporal adaptive localized learning (STALL). We formulate video modeling as a linear regression problem, in which motion information is embedded within the regression coefficients. The coefficients are adaptively learned within a local space-time window based on the linear minimum mean square error (LMMSE) criterion. By incorporating spatio-temporal resampling and a Bayesian fusion scheme, we can enhance the modeling capability of STALL on more general videos. Under the framework of STALL, we can develop video processing algorithms for a variety of applications by adjusting model parameters (i.e., the size and topology of the model support and training window). We apply STALL to three video processing problems. The simulation results show that motion information can be efficiently exploited by our implicit motion representation, and that resampling and fusion help to enhance the modeling capability of STALL.
    Secondly, we propose a nonparametric video modeling approach that does not depend on explicit motion estimation. Assuming the video sequence is composed of many overlapping space-time patches, we propose to embed motion-related information into the relationships among video patches and develop a generic sparsity-based prior for typical video sequences. First, we extend block matching to more general kNN-based patch clustering, which provides an implicit and distributed representation of motion information. We propose to enforce the sparsity constraint on a higher-dimensional data array, which is generated by packing the patches in the similar-patch set. Then we solve the inference problem by iteratively updating the kNN array and the desired signal. Finally, we present a Bayesian fusion approach to fuse multiple-hypothesis inferences. Simulation results in video error concealment, denoising, and deartifacting are reported to demonstrate its modeling capability.
    Finally, we summarize the two proposed video modeling approaches. We also point out the prospects of implicit motion representations in applications ranging from low-level to high-level problems.
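
The kNN-based patch clustering mentioned in this abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names (`extract_patch`, `knn_patches`), the squared-error distance, the brute-force search and the `radius` search window are all assumptions made for illustration. The sketch shows how the k patches most similar to a reference patch, gathered across frames, carry motion information implicitly through their positions, without any motion vector being estimated.

```python
from typing import List, Tuple

Frame = List[List[float]]            # one grayscale frame, indexed [row][col]
Match = Tuple[float, int, int, int]  # (distance, frame index, top, left)


def extract_patch(frame: Frame, top: int, left: int, size: int) -> List[float]:
    """Flatten the size x size patch whose top-left corner is (top, left)."""
    return [frame[r][c]
            for r in range(top, top + size)
            for c in range(left, left + size)]


def squared_distance(a: List[float], b: List[float]) -> float:
    """Sum of squared differences between two flattened patches."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn_patches(frames: List[Frame], ref_t: int, ref_top: int, ref_left: int,
                size: int, radius: int, k: int) -> List[Match]:
    """Return the k patches closest to the reference patch.

    Every frame is searched within `radius` pixels of the reference
    position (an assumed search window). The reference patch itself is
    included and always comes back with distance 0.
    """
    ref = extract_patch(frames[ref_t], ref_top, ref_left, size)
    candidates: List[Match] = []
    for t, frame in enumerate(frames):
        rows, cols = len(frame), len(frame[0])
        top_lo, top_hi = max(0, ref_top - radius), min(rows - size, ref_top + radius)
        left_lo, left_hi = max(0, ref_left - radius), min(cols - size, ref_left + radius)
        for top in range(top_lo, top_hi + 1):
            for left in range(left_lo, left_hi + 1):
                patch = extract_patch(frame, top, left, size)
                candidates.append((squared_distance(ref, patch), t, top, left))
    candidates.sort()  # ascending distance; ties broken by frame index and position
    return candidates[:k]
```

With k = 1 and a search restricted to a single neighbouring frame, this reduces to ordinary block matching. Keeping several neighbours instead is what makes the representation distributed.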

    BRUISE DETECTION IN APPLES USING 3D INFRARED IMAGING AND MACHINE LEARNING TECHNOLOGIES

    Bruise detection plays an important role in fruit grading. A bruise detection system capable of finding and removing damaged products on production lines will distinctly improve the quality of fruit for sale, and consequently improve the fruit economy. This dissertation presents a novel automatic detection system for identifying bruised apples, based on surface information obtained from a 3D near-infrared imaging technique. The proposed 3D bruise detection system is expected to provide better bruise detection performance than existing 2D systems. We first propose a mesh denoising filter that reduces the effect of noise while preserving the geometric features of the meshes. Compared with several existing mesh denoising filters, the proposed filter achieves better performance both in reducing the effect of noise and in preserving bruised regions in 3D meshes of bruised apples. Next, we investigate two different machine learning techniques for the identification of bruised apples. The first technique extracts hand-crafted features from 3D meshes and trains a predictive classifier on them. It is shown that the predictive model trained on the proposed hand-crafted features outperforms the same models trained on several other local shape descriptors. The second technique applies deep learning to learn the feature representation automatically from the mesh data, and then uses the deep learning model or a new predictive model for the classification. The optimized deep learning model achieves very high classification accuracy and outperforms the detection system based on the proposed hand-crafted features. Finally, we investigate GPU techniques for accelerating the proposed apple bruise detection system. Specifically, the dissertation proposes a GPU framework, implemented in CUDA, for accelerating the algorithm that extracts vertex-based local binary patterns. Experimental results show that the proposed GPU program speeds up the extraction of local binary patterns by a factor of 5 compared to a single-core CPU program.

    Quadtree Structured Approximation Algorithms

    The success of many image restoration algorithms is often due to their ability to sparsely describe the original signal. Many sparsity-promoting transforms exist, including wavelets, the so-called ‘lets’ family of transforms and more recent non-local learned transforms. The first part of this thesis reviews sparse approximation theory, particularly in relation to 2-D piecewise polynomial signals. We also show the connection between this theory and current state-of-the-art algorithms that cover the following image restoration and enhancement applications: denoising, deconvolution, interpolation and multi-view super-resolution. In [63], Shukla et al. proposed a compression algorithm, based on a sparse quadtree decomposition model, which could optimally represent piecewise polynomial images. In the second part of this thesis we adapt this model to image restoration by changing the rate-distortion penalty to a description-length penalty. Moreover, one of the major drawbacks of this type of approximation is the computational complexity required to find a suitable subspace for each node of the quadtree. We address this issue by searching for a suitable subspace much more efficiently using the mathematics of updating matrix factorisations. Novel algorithms are developed to tackle the four problems previously mentioned. Simulation results indicate that we beat state-of-the-art results when the original signal is in the model (e.g. depth images) and are competitive for natural images when the degradation is high.
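
To make the quadtree decomposition idea concrete, here is a deliberately simplified sketch. It is not the algorithm of Shukla et al. or of this thesis. It fits only a constant (a degree-0 polynomial) per block, and it replaces the rate-distortion or description-length penalty with a plain squared-error threshold. The names `block_stats`, `quadtree` and `max_sse` are hypothetical, and the image is assumed to be square with a power-of-two side length.

```python
from typing import List, Tuple

Image = List[List[float]]            # square image, indexed [row][col]
Leaf = Tuple[int, int, int, float]   # (top, left, size, fitted constant)


def block_stats(img: Image, top: int, left: int, size: int) -> Tuple[float, float]:
    """Return the mean of a block and the squared error of approximating it by that mean."""
    values = [img[r][c]
              for r in range(top, top + size)
              for c in range(left, left + size)]
    mean = sum(values) / len(values)
    sse = sum((v - mean) ** 2 for v in values)
    return mean, sse


def quadtree(img: Image, top: int, left: int, size: int,
             max_sse: float, min_size: int = 1) -> List[Leaf]:
    """Recursively split a block into four quadrants until each leaf is well approximated.

    A block becomes a leaf when its constant fit has squared error at most
    `max_sse` (an assumed stand-in for a proper penalty term) or when it
    has reached `min_size`.
    """
    mean, sse = block_stats(img, top, left, size)
    if sse <= max_sse or size <= min_size:
        return [(top, left, size, mean)]
    half = size // 2
    leaves: List[Leaf] = []
    for dr in (0, half):
        for dc in (0, half):
            leaves.extend(quadtree(img, top + dr, left + dc, half, max_sse, min_size))
    return leaves
```

Smooth regions end up as a few large leaves and detail concentrates in small ones, which is the sense in which the representation is sparse. A fuller version would fit a higher-order polynomial at each node, which is where the per-node subspace search discussed in the abstract becomes the dominant cost.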

    Characterization of palmprints by wavelet signatures via directional context modeling

    2003-2004 > Academic research: refereed > Publication in refereed journal. Version of Record. Published.