15 research outputs found

    3D Capturing Performances of Low-Cost Range Sensors for Mass-Market Applications

    Get PDF
    Since the advent of the first Kinect as a motion-controller device for the Microsoft XBOX platform (November 2010), several similar active, low-cost range sensing devices have been introduced on the mass market for purposes including gesture-based interfaces, 3D multimedia interaction, robot navigation, finger tracking, 3D body scanning for garment design, and automotive proximity sensing. However, given their capability to generate a real-time stream of range images, these devices have also been used in some projects as general-purpose range devices, with performances that for some applications might be satisfying. This paper describes the working principle of the various devices and analyses their systematic and random errors in order to explore their applicability to standard 3D capturing problems. Five devices featuring three different technologies have been tested: i) the Kinect V1 by Microsoft, the Structure Sensor by Occipital, and the Xtion PRO by ASUS, all based on different implementations of the Primesense sensor; ii) the F200 by Intel/Creative, implementing the RealSense pattern-projection technology; iii) the Kinect V2 by Microsoft, equipped with the Canesta TOF camera. A critical analysis of the results first compares the devices and then identifies the range of applications for which they could actually work as a viable solution.

    A Noise-Aware Coding Scheme for Texture Classification

    Get PDF
    Texture-based analysis of images is a common and much-discussed issue in the fields of computer vision and image processing. Several methods have already been proposed to codify texture micro-patterns (texlets) in images. Most of these methods perform well when a given image is noise-free, but real-world images contain different types of signal-independent as well as signal-dependent noise originating from different sources, even from the camera sensor itself. Hence, it is necessary to distinguish false textures that appear due to noise and thus to achieve a reliable representation of texlets. In this proposal, we define an adaptive noise band (ANB) to approximate, up to a certain extent, the amount of noise contamination around a pixel. Based on this ANB, we generate reliable codes named noise tolerant ternary pattern (NTTP) to represent the texlets in an image. Extensive experiments on several datasets from renowned texture databases, such as the Outex and the Brodatz databases, show that NTTP performs much better than the state-of-the-art methods.
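
    As an illustration of the general idea behind such noise-band ternary codes, the sketch below encodes a 3x3 neighbourhood as a local ternary pattern whose threshold is an adaptive band derived from the local standard deviation. The band definition, neighbourhood size, and function names are assumptions for illustration, not the paper's exact ANB/NTTP formulation.

```python
import numpy as np

def noise_band(patch, k=0.5):
    # Hypothetical adaptive noise band: proportional to the local standard
    # deviation around the centre pixel (an assumption, not the paper's ANB).
    return k * patch.std()

def ternary_code(image, y, x):
    """Encode the 3x3 neighbourhood of (y, x) as a ternary pattern.

    Differences that fall inside the adaptive noise band are treated as
    'equal' (code 0), so small noise-induced fluctuations do not flip the
    pattern, which is the general idea behind NTTP-style codes.
    """
    patch = image[y - 1:y + 2, x - 1:x + 2].astype(float)
    band = noise_band(patch)
    centre = patch[1, 1]
    # The 8 neighbours in row-major order, excluding the centre pixel.
    neighbours = np.delete(patch.flatten(), 4)
    diff = neighbours - centre
    codes = np.where(diff > band, 1, np.where(diff < -band, -1, 0))
    # Split the ternary pattern into two binary codes (as in LTP-style schemes).
    upper = sum(int(c == 1) << i for i, c in enumerate(codes))
    lower = sum(int(c == -1) << i for i, c in enumerate(codes))
    return upper, lower

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    img = rng.integers(0, 256, size=(32, 32)).astype(float)
    print(ternary_code(img, 5, 5))
```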

    Multi-wavelet residual dense convolutional neural network for image denoising

    Full text link
    Networks with a large receptive field (RF) have shown advanced fitting ability in recent years. In this work, we use short-term residual learning to improve the performance and robustness of networks for image denoising tasks. We choose a multi-wavelet convolutional neural network (MWCNN), one of the state-of-the-art networks with a large RF, as the backbone, and insert residual dense blocks (RDBs) into each of its layers. We call this scheme the multi-wavelet residual dense convolutional neural network (MWRDCNN). Compared with other RDB-based networks, it can extract more features of the object from adjacent layers, preserve the large RF, and boost computing efficiency. This approach also makes it possible to absorb the advantages of multiple architectures in a single network without conflicts. The performance of the proposed method is demonstrated in extensive experiments and compared with existing techniques.
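
    For readers unfamiliar with the building block being inserted into the MWCNN backbone, the following is a minimal PyTorch-style sketch of a residual dense block. The depth, growth rate, and channel counts are assumptions and do not reproduce the MWRDCNN configuration.

```python
import torch
import torch.nn as nn

class ResidualDenseBlock(nn.Module):
    """Minimal residual dense block (RDB) sketch.

    Each conv layer sees the concatenation of all previous feature maps
    (dense connections); a 1x1 conv fuses them and a residual connection
    adds the block input back. Channel counts and depth are assumptions.
    """

    def __init__(self, channels=64, growth=32, num_layers=4):
        super().__init__()
        self.layers = nn.ModuleList()
        in_ch = channels
        for _ in range(num_layers):
            self.layers.append(nn.Sequential(
                nn.Conv2d(in_ch, growth, kernel_size=3, padding=1),
                nn.ReLU(inplace=True)))
            in_ch += growth
        # Local feature fusion back to the block's channel count.
        self.fuse = nn.Conv2d(in_ch, channels, kernel_size=1)

    def forward(self, x):
        features = [x]
        for layer in self.layers:
            features.append(layer(torch.cat(features, dim=1)))
        # Local residual learning: fused features plus the block input.
        return self.fuse(torch.cat(features, dim=1)) + x

if __name__ == "__main__":
    block = ResidualDenseBlock()
    out = block(torch.randn(1, 64, 32, 32))
    print(out.shape)  # torch.Size([1, 64, 32, 32])
```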

    A Collaborative Adaptive Wiener Filter for Image Restoration Using a Spatial-Domain Multi-patch Correlation Model

    Get PDF
    We present a new patch-based image restoration algorithm using an adaptive Wiener filter (AWF) with a novel spatial-domain multi-patch correlation model. The new filter structure is referred to as a collaborative adaptive Wiener filter (CAWF). The CAWF employs a finite-size moving window. At each position, the current observation window represents the reference patch. We identify the most similar patches in the image within a given search window about the reference patch. A single-stage weighted sum of all of the pixels in the similar patches is used to estimate the center pixel in the reference patch. The weights are based on a new multi-patch correlation model that takes into account each pixel's spatial distance to the center of its corresponding patch, as well as the intensity vector distances among the similar patches. One key advantage of the CAWF approach, compared with many other patch-based algorithms, is that it can jointly handle blur and noise. Furthermore, it can also readily treat spatially varying signal and noise statistics. To the best of our knowledge, this is the first multi-patch algorithm to use a single spatial-domain weighted sum of all pixels within multiple similar patches to form its estimate, and the first to use a spatial-domain multi-patch correlation model to determine the weights. The experimental results show that the proposed method delivers high performance in image restoration in a variety of scenarios.
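
    The core estimation step can be sketched as follows: gather the most similar patches inside a search window and estimate the reference patch's centre pixel as a single weighted sum over all pixels of those patches. The Gaussian spatial/intensity weights below are a heuristic stand-in for the paper's Wiener (multi-patch correlation model) weights, and all parameter names and values are illustrative assumptions.

```python
import numpy as np

def cawf_pixel_estimate(image, y, x, patch=5, search=11, k=8,
                        sigma_s=2.0, sigma_r=20.0):
    """Estimate one pixel as a weighted sum over similar patches.

    Heuristic sketch of the CAWF idea: collect the k patches most similar
    to the reference patch inside a search window, then average *all* of
    their pixels with weights that decay with spatial distance to each
    patch centre and with patch-to-patch intensity distance.
    """
    r = patch // 2
    ref = image[y - r:y + r + 1, x - r:x + r + 1].astype(float)

    # Collect candidate patches whose centres lie in the search window.
    candidates = []
    s = search // 2
    for yy in range(max(r, y - s), min(image.shape[0] - r, y + s + 1)):
        for xx in range(max(r, x - s), min(image.shape[1] - r, x + s + 1)):
            p = image[yy - r:yy + r + 1, xx - r:xx + r + 1].astype(float)
            candidates.append((np.sum((p - ref) ** 2), p))
    candidates.sort(key=lambda c: c[0])

    # Spatial weight: distance of each pixel to its own patch centre.
    gy, gx = np.mgrid[-r:r + 1, -r:r + 1]
    w_spatial = np.exp(-(gy ** 2 + gx ** 2) / (2 * sigma_s ** 2))

    num, den = 0.0, 0.0
    for dist, p in candidates[:k]:
        w_patch = np.exp(-dist / (patch ** 2 * 2 * sigma_r ** 2))
        w = w_spatial * w_patch
        num += np.sum(w * p)
        den += np.sum(w)
    return num / den

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    noisy = rng.normal(128, 15, size=(40, 40))
    print(cawf_pixel_estimate(noisy, 20, 20))
```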

    Video modeling via implicit motion representations

    Get PDF
    Video modeling refers to the development of analytical representations for explaining the intensity distribution in video signals. Based on such analytical representations, we can develop algorithms for accomplishing particular video-related tasks; video modeling therefore provides a foundation that bridges video data and related tasks. Although many video models have been proposed in past decades, the rise of new applications calls for more efficient and accurate video modeling approaches.

    Most existing video modeling approaches are based on explicit motion representations, where motion information is explicitly expressed by correspondence-based representations (i.e., motion velocity or displacement). Although conceptually simple, the limitations of those representations and the suboptimality of motion estimation techniques can degrade such video modeling approaches, especially when handling complex motion or non-ideal observation data. In this thesis, we propose to investigate video modeling without explicit motion representation: motion information is implicitly embedded in the spatio-temporal dependency among pixels or patches instead of being explicitly described by motion vectors.

    Firstly, we propose a parametric model based on spatio-temporal adaptive localized learning (STALL). We formulate video modeling as a linear regression problem in which motion information is embedded within the regression coefficients. The coefficients are adaptively learned within a local space-time window based on the LMMSE criterion. By incorporating spatio-temporal resampling and a Bayesian fusion scheme, we can enhance the modeling capability of STALL on more general videos. Under the framework of STALL, we can develop video processing algorithms for a variety of applications by adjusting the model parameters (i.e., the size and topology of the model support and training window). We apply STALL to three video processing problems. The simulation results show that motion information can be efficiently exploited by our implicit motion representation and that the resampling and fusion do help to enhance the modeling capability of STALL.

    Secondly, we propose a nonparametric video modeling approach that does not depend on explicit motion estimation. Assuming the video sequence is composed of many overlapping space-time patches, we propose to embed motion-related information into the relationships among video patches and develop a generic sparsity-based prior for typical video sequences. First, we extend block matching to more general kNN-based patch clustering, which provides an implicit and distributed representation of motion information. We propose to enforce a sparsity constraint on a higher-dimensional data array formed by packing the patches in the similar-patch set. We then solve the inference problem by updating the kNN array and the desired signal iteratively. Finally, we present a Bayesian fusion approach to fuse multiple-hypothesis inferences. Simulation results on video error concealment, denoising, and deartifacting are reported to demonstrate its modeling capability.

    Finally, we summarize the two proposed video modeling approaches and point out the perspectives of implicit motion representations in applications ranging from low-level to high-level problems.
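
    A minimal sketch of the STALL idea described above, assuming a small model support and training window: least-squares (LMMSE-style) coefficients are learned from a local space-time window and used to predict a pixel from its neighbourhood in the previous frame, so motion is captured implicitly by the coefficients. Window sizes and function names are illustrative assumptions, not the thesis' exact model.

```python
import numpy as np

def stall_predict(prev_frame, cur_frame, y, x, support=3, train=7):
    """Predict cur_frame[y, x] from a spatio-temporal neighbourhood.

    Within a local training window, fit least-squares regression
    coefficients that map each pixel's neighbourhood in the previous
    frame to the co-located pixel in the current frame; motion is
    captured implicitly by the learned coefficients rather than by
    explicit motion vectors.
    """
    r, t = support // 2, train // 2
    rows, cols = [], []
    for yy in range(y - t, y + t + 1):
        for xx in range(x - t, x + t + 1):
            if yy == y and xx == x:
                continue  # exclude the pixel being predicted from training
            patch = prev_frame[yy - r:yy + r + 1, xx - r:xx + r + 1]
            rows.append(patch.flatten())
            cols.append(cur_frame[yy, xx])
    A, b = np.asarray(rows, float), np.asarray(cols, float)
    coef, *_ = np.linalg.lstsq(A, b, rcond=None)
    target = prev_frame[y - r:y + r + 1, x - r:x + r + 1].flatten()
    return float(target @ coef)

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    f0 = rng.normal(size=(32, 32))
    f1 = np.roll(f0, shift=1, axis=1)  # simple horizontal shift between frames
    print(stall_predict(f0, f1, 16, 16), f1[16, 16])
```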

    Image Restoration Using Space-Variant Gaussian Scale Mixtures in Overcomplete Pyramids

    Full text link

    Permutation recovery in shuffled total least squares regression

    Full text link
    Shuffled linear regression concerns itself with linear models with an unknown correspondence between the input and the output. This correspondence is usually represented by a permutation matrix Π*. The model we are interested in has a further complication: the design matrix is itself latent and is observed with noise. This is a type of errors-in-variables (EIV) model. Our interest lies in the recovery of the permutation matrix. We propose an estimator for Π* based on the total least squares (TLS) technique, a common estimation method for EIV models. The estimation problem can be viewed as approximating one matrix by another of lower rank, and the quantity it seeks to minimize is the sum of the smallest singular values squared. Due to identifiability issues, we evaluate the proposed estimator by the normalized Procrustes quadratic loss, which allows for an orthogonal rotation of the estimated design matrix. Our main result provides an upper bound on this quantity, showing that the signal-to-noise ratio must go to infinity in order for the loss to go to zero. On the computational front, since the problem of permutation recovery is NP-hard, we propose a simple and efficient algorithm named the alternating LAP/TLS algorithm (ALTA) to approximate the estimator, and we use it to empirically examine the main result. The main idea of the algorithm is to alternate between estimating the unknown coefficient matrix using the TLS method and estimating the latent permutation matrix by solving a linear assignment problem (LAP), which runs in polynomial time. Lastly, we propose a hypothesis testing procedure based on graph matching, which we apply in the field of digital humanities to character social networks constructed from novel series.
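
    A minimal sketch of the alternating idea behind ALTA, assuming an identity initialisation and a fixed iteration cap: each pass estimates the coefficient matrix by total least squares given the current row alignment, then re-aligns the rows of the response matrix by solving a linear assignment problem with scipy. This is an illustrative approximation, not the authors' exact algorithm or stopping rule.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def tls(X, Y):
    """Total least squares estimate of B in X @ B ~ Y (both observed with noise)."""
    n, d = X.shape
    _, _, Vt = np.linalg.svd(np.hstack([X, Y]), full_matrices=False)
    V = Vt.T
    V12, V22 = V[:d, d:], V[d:, d:]
    return -V12 @ np.linalg.inv(V22)

def alta(X, Y, iters=20):
    """Alternating LAP/TLS sketch for shuffled total least squares.

    Alternates between (1) estimating the coefficient matrix by TLS under
    the current row alignment and (2) re-aligning the rows of Y to X @ B
    by solving a linear assignment problem.
    """
    perm = np.arange(len(Y))
    for _ in range(iters):
        B = tls(X, Y[perm])
        fitted = X @ B
        # cost[j, i]: squared distance between fitted row j and observed row i.
        cost = ((fitted[:, None, :] - Y[None, :, :]) ** 2).sum(axis=2)
        _, new_perm = linear_sum_assignment(cost)
        if np.array_equal(new_perm, perm):
            break
        perm = new_perm
    return B, perm

if __name__ == "__main__":
    rng = np.random.default_rng(3)
    X = rng.normal(size=(60, 3))
    B_true = rng.normal(size=(3, 2))
    Y = X @ B_true + 0.01 * rng.normal(size=(60, 2))
    Y = Y[rng.permutation(60)]  # rows of Y shuffled by an unknown permutation
    B_hat, perm = alta(X + 0.01 * rng.normal(size=X.shape), Y)
    print(np.round(B_hat, 2))
```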