17,733 research outputs found

    Temporal shape super-resolution by intra-frame motion encoding using high-fps structured light

    Full text link
    One of the solutions of depth imaging of moving scene is to project a static pattern on the object and use just a single image for reconstruction. However, if the motion of the object is too fast with respect to the exposure time of the image sensor, patterns on the captured image are blurred and reconstruction fails. In this paper, we impose multiple projection patterns into each single captured image to realize temporal super resolution of the depth image sequences. With our method, multiple patterns are projected onto the object with higher fps than possible with a camera. In this case, the observed pattern varies depending on the depth and motion of the object, so we can extract temporal information of the scene from each single image. The decoding process is realized using a learning-based approach where no geometric calibration is needed. Experiments confirm the effectiveness of our method where sequential shapes are reconstructed from a single image. Both quantitative evaluations and comparisons with recent techniques were also conducted.Comment: 9 pages, Published at the International Conference on Computer Vision (ICCV 2017

    Total Variation Regularized Tensor RPCA for Background Subtraction from Compressive Measurements

    Full text link
    Background subtraction has been a fundamental and widely studied task in video analysis, with a wide range of applications in video surveillance, teleconferencing and 3D modeling. Recently, motivated by compressive imaging, background subtraction from compressive measurements (BSCM) is becoming an active research task in video surveillance. In this paper, we propose a novel tensor-based robust PCA (TenRPCA) approach for BSCM by decomposing video frames into backgrounds with spatial-temporal correlations and foregrounds with spatio-temporal continuity in a tensor framework. In this approach, we use 3D total variation (TV) to enhance the spatio-temporal continuity of foregrounds, and Tucker decomposition to model the spatio-temporal correlations of video background. Based on this idea, we design a basic tensor RPCA model over the video frames, dubbed as the holistic TenRPCA model (H-TenRPCA). To characterize the correlations among the groups of similar 3D patches of video background, we further design a patch-group-based tensor RPCA model (PG-TenRPCA) by joint tensor Tucker decompositions of 3D patch groups for modeling the video background. Efficient algorithms using alternating direction method of multipliers (ADMM) are developed to solve the proposed models. Extensive experiments on simulated and real-world videos demonstrate the superiority of the proposed approaches over the existing state-of-the-art approaches.Comment: To appear in IEEE TI

    Online Video Deblurring via Dynamic Temporal Blending Network

    Full text link
    State-of-the-art video deblurring methods are capable of removing non-uniform blur caused by unwanted camera shake and/or object motion in dynamic scenes. However, most existing methods are based on batch processing and thus need access to all recorded frames, rendering them computationally demanding and time consuming and thus limiting their practical use. In contrast, we propose an online (sequential) video deblurring method based on a spatio-temporal recurrent network that allows for real-time performance. In particular, we introduce a novel architecture which extends the receptive field while keeping the overall size of the network small to enable fast execution. In doing so, our network is able to remove even large blur caused by strong camera shake and/or fast moving objects. Furthermore, we propose a novel network layer that enforces temporal consistency between consecutive frames by dynamic temporal blending which compares and adaptively (at test time) shares features obtained at different time steps. We show the superiority of the proposed method in an extensive experimental evaluation.Comment: 10 page

    3D Object Reconstruction from Hand-Object Interactions

    Full text link
    Recent advances have enabled 3d object reconstruction approaches using a single off-the-shelf RGB-D camera. Although these approaches are successful for a wide range of object classes, they rely on stable and distinctive geometric or texture features. Many objects like mechanical parts, toys, household or decorative articles, however, are textureless and characterized by minimalistic shapes that are simple and symmetric. Existing in-hand scanning systems and 3d reconstruction techniques fail for such symmetric objects in the absence of highly distinctive features. In this work, we show that extracting 3d hand motion for in-hand scanning effectively facilitates the reconstruction of even featureless and highly symmetric objects and we present an approach that fuses the rich additional information of hands into a 3d reconstruction pipeline, significantly contributing to the state-of-the-art of in-hand scanning.Comment: International Conference on Computer Vision (ICCV) 2015, http://files.is.tue.mpg.de/dtzionas/In-Hand-Scannin

    DA-RNN: Semantic Mapping with Data Associated Recurrent Neural Networks

    Full text link
    3D scene understanding is important for robots to interact with the 3D world in a meaningful way. Most previous works on 3D scene understanding focus on recognizing geometrical or semantic properties of the scene independently. In this work, we introduce Data Associated Recurrent Neural Networks (DA-RNNs), a novel framework for joint 3D scene mapping and semantic labeling. DA-RNNs use a new recurrent neural network architecture for semantic labeling on RGB-D videos. The output of the network is integrated with mapping techniques such as KinectFusion in order to inject semantic information into the reconstructed 3D scene. Experiments conducted on a real world dataset and a synthetic dataset with RGB-D videos demonstrate the ability of our method in semantic 3D scene mapping.Comment: Published in RSS 201

    Temporal phase unwrapping using deep learning

    Full text link
    The multi-frequency temporal phase unwrapping (MF-TPU) method, as a classical phase unwrapping algorithm for fringe projection profilometry (FPP), is capable of eliminating the phase ambiguities even in the presence of surface discontinuities or spatially isolated objects. For the simplest and most efficient case, two sets of 3-step phase-shifting fringe patterns are used: the high-frequency one is for 3D measurement and the unit-frequency one is for unwrapping the phase obtained from the high-frequency pattern set. The final measurement precision or sensitivity is determined by the number of fringes used within the high-frequency pattern, under the precondition that the phase can be successfully unwrapped without triggering the fringe order error. Consequently, in order to guarantee a reasonable unwrapping success rate, the fringe number (or period number) of the high-frequency fringe patterns is generally restricted to about 16, resulting in limited measurement accuracy. On the other hand, using additional intermediate sets of fringe patterns can unwrap the phase with higher frequency, but at the expense of a prolonged pattern sequence. Inspired by recent successes of deep learning techniques for computer vision and computational imaging, in this work, we report that the deep neural networks can learn to perform TPU after appropriate training, as called deep-learning based temporal phase unwrapping (DL-TPU), which can substantially improve the unwrapping reliability compared with MF-TPU even in the presence of different types of error sources, e.g., intensity noise, low fringe modulation, and projector nonlinearity. We further experimentally demonstrate for the first time, to our knowledge, that the high-frequency phase obtained from 64-period 3-step phase-shifting fringe patterns can be directly and reliably unwrapped from one unit-frequency phase using DL-TPU

    Improved method for phase wraps reduction in profilometry

    Full text link
    In order to completely eliminate, or greatly reduce the number of phase wraps in 2D wrapped phase map, Gdeisat et al. proposed an algorithm, which uses shifting the spectrum towards the origin. But the spectrum can be shifted only by an integer number, meaning that the phase wraps reduction is often not optimal. In addition, Gdeisat's method will take much time to make the Fourier transform, inverse Fourier transform, select and shift the spectral components. In view of the above problems, we proposed an improved method for phase wraps elimination or reduction. First, the wrapped phase map is padded with zeros, the carrier frequency of the projected fringe is determined by high resolution, which can be used as the moving distance of the spectrum. And then realize frequency shift in spatial domain. So it not only can enable the spectrum to be shifted by a rational number when the carrier frequency is not an integer number, but also reduce the execution time. Finally, the experimental results demonstrated that the proposed method is feasible.Comment: 16 pages, 15 figures, 1 table. arXiv admin note: text overlap with arXiv:1604.0723
    corecore