155 research outputs found

    An Extended Occlusion Detection Approach for Video Processing

    Occlusions show up as failure regions in video processing when results are integrated over time, because violations of the brightness constancy constraint accumulate and evolve in occluded regions. Accuracy at the boundaries of moving objects remains a challenging area that requires further research. This paper presents a work-in-progress approach that detects occlusion regions using pixel-wise coherence, segment-wise confidence and an interpolation technique. Our method obtains the same result as conventional methods by solving only one partial differential equation (PDE) problem; it is faster than existing methods and provides better coverage rates for occlusion regions than variational techniques when tested on a range of benchmark datasets. With these improved results, we can apply and extend our approach to a wider range of applications in computer vision, such as background subtraction, tracking, 3D reconstruction, video surveillance and video compression.
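
The abstract above attributes occlusion failures to violations of the brightness constraint. As a rough illustration of that cue only (not the paper's method, which also uses segment-wise confidence and interpolation), the sketch below flags pixels whose brightness changes along a given motion field. The function name `occlusion_candidates`, the integer flow format and the threshold value are all assumptions made for this example.

```python
def occlusion_candidates(frame0, frame1, flow, threshold=10.0):
    """Flag pixels whose brightness-constancy residual exceeds `threshold`.

    frame0, frame1: 2D lists of grey levels with the same shape.
    flow: 2D list of integer (dx, dy) displacements from frame0 to frame1
          (hypothetical input format chosen for this sketch).
    Returns a 2D list of booleans, True marking a candidate occlusion.
    """
    height, width = len(frame0), len(frame0[0])
    mask = [[False] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            dx, dy = flow[y][x]
            tx, ty = x + dx, y + dy
            if not (0 <= tx < width and 0 <= ty < height):
                mask[y][x] = True  # the pixel moves out of the frame
                continue
            # Brightness constancy expects frame1 at the target to match frame0.
            residual = abs(frame1[ty][tx] - frame0[y][x])
            mask[y][x] = residual > threshold
    return mask


# A bright object (200) moves one pixel right and covers the background at x=3.
frame0 = [[10, 10, 200, 10]]
frame1 = [[10, 10, 10, 200]]
flow = [[(0, 0), (0, 0), (1, 0), (0, 0)]]
mask = occlusion_candidates(frame0, frame1, flow)
```

In this toy case only the covered background pixel is flagged; a real detector would need sub-pixel flow and some handling of noise and illumination change.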

    Does Time Smoothen Space? Implications for Space-Time Representation

    The continuous nature of space and time is a fundamental tenet of many scientific endeavors. That digital representation imposes granularity is well recognized, but whether it is possible to address space completely remains unanswered. This paper argues that Hales' proof of Kepler's conjecture on the packing of hard spheres suggests the answer to be "no", providing examples of why this matters in GIS generally and considering implications for spatio-temporal GIS in particular. It seeks to resolve the dichotomy between continuous and granular space by showing how a continuous space may be emergent over a random graph. However, the projection of this latent space into 3D/4D imposes granularity. Perhaps surprisingly, representing space and time as locally conjugate may be key to addressing a "smooth" spatial continuum. This insight leads to the suggestion of Face Centered Cubic Packing as a space-time topology but also raises further questions for spatio-temporal representation.
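
For readers unfamiliar with face-centered cubic (FCC) packing, the sketch below builds a small FCC lattice and checks two standard facts: each site has 12 nearest neighbours, and the sphere-packing density is pi / (3 * sqrt(2)), the value Hales' proof shows cannot be exceeded. It illustrates the lattice only and says nothing about how the paper uses it as a space-time topology; the helper names `fcc_points` and `nearest_neighbours` are made up for this example.

```python
import math
from itertools import product


def fcc_points(n):
    """Integer points (i, j, k) in [-n, n]^3 whose coordinate sum is even.

    This is one standard description of the FCC lattice, with a
    conventional cubic cell of edge length 2.
    """
    rng = range(-n, n + 1)
    return [p for p in product(rng, repeat=3) if sum(p) % 2 == 0]


def nearest_neighbours(point, points):
    """Return the lattice points at minimum non-zero distance from `point`."""
    dists = {q: math.dist(point, q) for q in points if q != point}
    d_min = min(dists.values())
    return [q for q, d in dists.items() if math.isclose(d, d_min)]


points = fcc_points(2)
neighbours = nearest_neighbours((0, 0, 0), points)

# Touching spheres have radius sqrt(2)/2 (half the nearest-neighbour distance);
# the cubic cell of edge 2 contains 4 lattice sites.
radius = math.sqrt(2) / 2
density = 4 * (4 / 3) * math.pi * radius**3 / 2**3
```

Here `len(neighbours)` is the coordination number and `density` is roughly 0.74, so about a quarter of the volume stays outside the spheres however finely the lattice is scaled.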

    Recognizing complex faces and gaits via novel probabilistic models

    In the field of computer vision, developing automated systems to recognize people under unconstrained scenarios is a partially solved problem. In unconstrained scenarios, a number of common variations and complexities such as occlusion, illumination, cluttered background and so on impose vast uncertainty on the recognition process. Among the various biometrics that have emerged recently, this dissertation focuses on two: face and gait recognition. Firstly, we address the problem of recognizing faces with major occlusions amidst other variations such as pose, scale, expression and illumination, using a novel PRObabilistic Component based Interpretation Model (PROCIM) inspired by key psychophysical principles that are closely related to reasoning under uncertainty. The model employs Bayesian Networks to establish, learn, interpret and exploit intrinsic similarity mappings from the face domain. Then, by incorporating efficient inference strategies, robust decisions are made for successfully recognizing faces under uncertainty. PROCIM reports improved recognition rates over recent approaches. Secondly, we address the emerging gait recognition problem and show that PROCIM can be easily adapted to the gait domain as well. We scientifically define and formulate sub-gaits and propose a novel modular training scheme to efficiently learn subtle sub-gait characteristics from the gait domain. Our results show that the proposed model is robust to several uncertainties and yields significant recognition performance. Finally, apart from PROCIM, we show how simple component-based gait reasoning can be coherently modeled using the recently prominent Markov Logic Networks (MLNs) by intuitively fusing imaging, logic and graphs. We have discovered that face and gait domains exhibit interesting similarity mappings between object entities and their components. We have proposed intuitive probabilistic methods to model these mappings to perform recognition under various uncertainty elements. Extensive experimental validation justifies the robustness of the proposed methods over state-of-the-art techniques.

    Feature-aware uniform tessellations on video manifold for content-sensitive supervoxels

    Over-segmenting a video into supervoxels has strong potential to reduce the complexity of computer vision applications. Content-sensitive supervoxels (CSS) are typically smaller in content-dense regions and larger in content-sparse regions. In this paper, we propose to compute feature-aware CSS (FCSS) that are regularly shaped 3D primitive volumes well aligned with local object/region/motion boundaries in video. To compute FCSS, we map a video to a 3-dimensional manifold, in which the volume elements of the video manifold give a good measure of the video content density. Then any uniform tessellation on the manifold can induce CSS. Our idea is that, among all possible uniform tessellations, FCSS find one whose cell boundaries align well with local video boundaries. To achieve this goal, we propose a novel tessellation method that simultaneously minimizes the tessellation energy and maximizes the average boundary distance. Theoretically, our method has an optimal competitive ratio of O(1). We also present a simple extension of FCSS to streaming FCSS for processing long videos that cannot be loaded into main memory at once. We evaluate FCSS, streaming FCSS and ten representative supervoxel methods on four video datasets and two novel video applications. The results show that our method simultaneously achieves state-of-the-art performance with respect to various evaluation criteria.

    Structural Material Property Tailoring Using Deep Neural Networks

    Advances in robotics, artificial intelligence, and machine learning are ushering in a new age of automation, as machines match or outperform human performance. Machine intelligence can enable businesses to improve performance by reducing errors, improving sensitivity, quality and speed, and in some cases achieving outcomes that go beyond current resource capabilities. Relevant applications include new product architecture design, rapid material characterization, and life-cycle management tied to a digital strategy that will enable efficient development of products from cradle to grave. There are also challenges that must be addressed through a major, sustained research effort based solidly on both inferential and computational principles applied to design tailoring of functionally optimized structures. Current applications of structural materials in the aerospace industry demand the highest quality control of material microstructure, especially for advanced rotational turbomachinery in aircraft engines, in order to obtain the best tailored material properties. In this paper, deep convolutional neural networks were developed to accurately predict processing-structure-property relations from material microstructure images, surpassing current best practices and modeling efforts. The models automatically learn critical features, without the need for manual specification and/or subjective and expensive image analysis. Further, in combination with generative deep learning models, a framework is proposed to enable rapid material design space exploration and property identification and optimization. The implementation must take account of real-time decision cycles and the trade-offs between speed and accuracy.

    Video modeling via implicit motion representations

    Video modeling refers to the development of analytical representations for explaining the intensity distribution in video signals. Based on the analytical representation, we can develop algorithms for accomplishing particular video-related tasks. Video modeling therefore provides a foundation for bridging video data and related tasks. Although many video models have been proposed in the past decades, the rise of new applications calls for more efficient and accurate video modeling approaches. Most existing video modeling approaches are based on explicit motion representations, where motion information is explicitly expressed by correspondence-based representations (i.e., motion velocity or displacement). Although this is conceptually simple, the limitations of those representations and the suboptimality of motion estimation techniques can degrade such video modeling approaches, especially when handling complex motion or non-ideal observation video data. In this thesis, we propose to investigate video modeling without explicit motion representation. Motion information is implicitly embedded into the spatio-temporal dependency among pixels or patches instead of being explicitly described by motion vectors. Firstly, we propose a parametric model based on spatio-temporal adaptive localized learning (STALL). We formulate video modeling as a linear regression problem, in which motion information is embedded within the regression coefficients. The coefficients are adaptively learned within a local space-time window based on the LMMSE criterion. By incorporating spatio-temporal resampling and a Bayesian fusion scheme, we can enhance the modeling capability of STALL on more general videos. Under the framework of STALL, we can develop video processing algorithms for a variety of applications by adjusting model parameters (i.e., the size and topology of the model support and training window). We apply STALL to three video processing problems. The simulation results show that motion information can be efficiently exploited by our implicit motion representation and that the resampling and fusion do help to enhance the modeling capability of STALL. Secondly, we propose a nonparametric video modeling approach that does not depend on explicit motion estimation. Assuming the video sequence is composed of many overlapping space-time patches, we propose to embed motion-related information into the relationships among video patches and develop a generic sparsity-based prior for typical video sequences. First, we extend block matching to more general kNN-based patch clustering, which provides an implicit and distributed representation for motion information. We propose to enforce the sparsity constraint on a higher-dimensional data array, which is generated by packing the patches in the similar patch set. Then we solve the inference problem by updating the kNN array and the desired signal iteratively. Finally, we present a Bayesian fusion approach to fuse multiple-hypothesis inferences. Simulation results in video error concealment, denoising, and deartifacting are reported to demonstrate its modeling capability. Finally, we summarize the two proposed video modeling approaches. We also point out the prospects of implicit motion representations in applications ranging from low-level to high-level problems.
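
The idea that motion can live inside regression coefficients can be seen in a toy setting. The sketch below is not the thesis's STALL implementation: it assumes a one-dimensional "video", a three-tap support in the previous frame, a training window taken from the preceding frame pair, and plain least squares with a tiny ridge term standing in for the LMMSE criterion. All function names are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def support(frame, x):
    """Three-tap support (x-1, x, x+1) taken from the previous frame."""
    return [frame[x - 1], frame[x], frame[x + 1]]


def learn_coefficients(prev, curr, centre, half_window, ridge=1e-9):
    """Least-squares fit of curr[x] from support(prev, x) for x near `centre`."""
    k = 3
    ata = [[ridge if i == j else 0.0 for j in range(k)] for i in range(k)]
    atb = [0.0] * k
    for x in range(centre - half_window, centre + half_window + 1):
        s = support(prev, x)
        for i in range(k):
            atb[i] += s[i] * curr[x]
            for j in range(k):
                ata[i][j] += s[i] * s[j]
    return solve_linear(ata, atb)


def predict(prev, x, coeffs):
    """Apply the learned coefficients to the support around x in `prev`."""
    return sum(c * v for c, v in zip(coeffs, support(prev, x)))


# Toy sequence: the same row shifted one pixel to the right per frame.
base = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8, 9, 7, 9, 3]
frames = [[base[(x - t) % len(base)] for x in range(len(base))] for t in range(3)]

coeffs = learn_coefficients(frames[0], frames[1], centre=8, half_window=4)
estimate = predict(frames[1], 8, coeffs)
```

The learned weights concentrate on the left tap, which encodes the one-pixel rightward shift without any motion vector ever being estimated; `estimate` then approximates `frames[2][8]`.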