    Confidence Propagation through CNNs for Guided Sparse Depth Regression

    Generally, convolutional neural networks (CNNs) process data on a regular grid, e.g. data generated by ordinary cameras. Designing CNNs for sparse and irregularly spaced input data is still an open research problem with numerous applications in autonomous driving, robotics, and surveillance. In this paper, we propose an algebraically-constrained normalized convolution layer for CNNs with highly sparse input that has a smaller number of network parameters compared to related work. We propose novel strategies for determining the confidence from the convolution operation and propagating it to consecutive layers. We also propose an objective function that simultaneously minimizes the data error while maximizing the output confidence. To integrate structural information, we also investigate fusion strategies to combine depth and RGB information in our normalized convolution network framework. In addition, we introduce the use of output confidence as auxiliary information to improve the results. The capabilities of our normalized convolution network framework are demonstrated for the problem of scene depth completion. Comprehensive experiments are performed on the KITTI-Depth and the NYU-Depth-v2 datasets. The results clearly demonstrate that the proposed approach achieves superior performance while requiring only about 1-5% of the number of parameters compared to the state-of-the-art methods. Comment: 14 pages, 14 figures
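
    The abstract above does not spell out the layer itself, so the following is only a minimal sketch of classic normalized convolution in 1-D, not the authors' network. It assumes a fixed non-negative kernel, binary input confidences, and a simple propagation rule (filtered confidence divided by the kernel sum); the function name `normalized_conv1d` and the `eps` parameter are hypothetical.

```python
def normalized_conv1d(signal, confidence, kernel, eps=1e-8):
    """Normalized convolution of a sparse 1-D signal (zero padding, same length).

    Assumption: kernel weights are non-negative, so the denominator is a
    valid accumulated confidence. Indexing is correlation-style (no flip),
    which is identical to convolution for the symmetric kernels used here.
    """
    if any(w < 0 for w in kernel):
        raise ValueError("kernel weights must be non-negative")
    n = len(signal)
    half = len(kernel) // 2
    kernel_sum = sum(kernel)
    output, output_conf = [], []
    for i in range(n):
        weighted_sum = 0.0   # sum of w * c * x over the window
        conf_sum = 0.0       # sum of w * c over the window
        for j, w in enumerate(kernel):
            idx = i + j - half
            if 0 <= idx < n:
                weighted_sum += w * confidence[idx] * signal[idx]
                conf_sum += w * confidence[idx]
        # Missing samples (confidence 0) do not pull the estimate toward 0.
        output.append(weighted_sum / (conf_sum + eps))
        # Assumed propagation rule: fraction of kernel mass that saw valid data.
        output_conf.append(conf_sum / kernel_sum)
    return output, output_conf


# Sparse "depth" row: only positions 1 and 4 carry measurements.
depth = [0.0, 2.0, 0.0, 0.0, 4.0, 0.0]
conf = [0.0, 1.0, 0.0, 0.0, 1.0, 0.0]
filled, new_conf = normalized_conv1d(depth, conf, kernel=[1.0, 2.0, 1.0])
```

    Here `filled` holds values interpolated from the valid neighbours only, and `new_conf` is highest where a measurement sits under the kernel centre. A second layer could take `filled` and `new_conf` as its inputs, which is the sense in which confidence is propagated.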

    Proceedings of the second "international Traveling Workshop on Interactions between Sparse models and Technology" (iTWIST'14)

    The implicit objective of the biennial "international Traveling Workshop on Interactions between Sparse models and Technology" (iTWIST) is to foster collaboration between international scientific teams by disseminating ideas through both specific oral/poster presentations and free discussions. For its second edition, the iTWIST workshop took place in the medieval and picturesque town of Namur in Belgium, from Wednesday August 27th till Friday August 29th, 2014. The workshop was conveniently located in "The Arsenal" building within walking distance of both hotels and town center. iTWIST'14 gathered about 70 international participants and featured 9 invited talks, 10 oral presentations, and 14 posters on the following themes, all related to the theory, application and generalization of the "sparsity paradigm": Sparsity-driven data sensing and processing; Union of low dimensional subspaces; Beyond linear and convex inverse problem; Matrix/manifold/graph sensing/processing; Blind inverse problems and dictionary learning; Sparsity and computational neuroscience; Information theory, geometry and randomness; Complexity/accuracy tradeoffs in numerical methods; Sparsity? What's next?; Sparse machine learning and inference. Comment: 69 pages, 24 extended abstracts, iTWIST'14 website: http://sites.google.com/site/itwist1

    Compressive sensing for 3D microwave imaging systems

    Compressed sensing (CS) image reconstruction techniques are developed and experimentally implemented for wideband microwave synthetic aperture radar (SAR) imaging systems with applications to nondestructive testing and evaluation. These techniques significantly reduce the number of spatial measurement points and, consequently, the acquisition time by sampling at a level lower than the Nyquist-Shannon rate. Benefiting from a reduced number of samples, this work successfully implemented two scanning procedures: the nonuniform raster and the optimum path. Three CS reconstruction approaches are also proposed for the wideband microwave SAR-based imaging systems. The first approach reconstructs a full set of raw data from undersampled measurements via L1-norm optimization and then applies 3D forward SAR on the reconstructed raw data. The second proposed approach employs forward SAR and reverse SAR (R-SAR) transforms in each L1-norm optimization iteration, reconstructing images directly. This dissertation proposes a simple, elegant truncation repair method to combat the truncation error, which is a critical obstacle to the convergence of the CS iterative algorithm. The third proposed CS reconstruction algorithm is the adaptive basis selection (ABS) compressed sensing. Rather than a fixed sparsifying basis, the proposed ABS method adaptively selects the best basis from a set of bases in each iteration of the L1-norm optimization according to a proposed decision metric that is derived from the sparsity of the image and the coherence between the measurement and sparsifying matrices. The results of several experiments indicate that the proposed algorithms recover 2D and 3D SAR images with only 20% of the spatial points and reduce the acquisition time by up to 66% relative to conventional methods while maintaining or improving the quality of the SAR images. --Abstract, page iv
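
    The abstract does not say which solver performs the L1-norm optimization, so the sketch below swaps in a standard one, the iterative shrinkage-thresholding algorithm (ISTA), applied to a toy problem with fewer measurements than unknowns. It illustrates L1-regularized recovery in general, not the dissertation's SAR pipeline; the names `ista` and `soft_threshold`, the matrix `A`, and the values of `lam` and `iters` are all assumptions chosen for the example.

```python
def soft_threshold(value, threshold):
    """Proximal operator of the L1 norm: shrink a scalar toward zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def ista(A, y, lam=0.1, iters=2000):
    """Minimize 0.5 * ||A x - y||^2 + lam * ||x||_1 with plain ISTA.

    A is a list of rows; y is the measurement vector. The step size uses
    the squared Frobenius norm of A, a safe upper bound on the Lipschitz
    constant of the gradient (assumption: simplicity over speed).
    """
    m, n = len(A), len(A[0])
    step = 1.0 / sum(a * a for row in A for a in row)
    x = [0.0] * n
    for _ in range(iters):
        # Residual r = A x - y
        r = [sum(A[i][j] * x[j] for j in range(n)) - y[i] for i in range(m)]
        # Gradient of the data term: A^T r
        g = [sum(A[i][j] * r[i] for i in range(m)) for j in range(n)]
        # Gradient step followed by shrinkage
        x = [soft_threshold(x[j] - step * g[j], step * lam) for j in range(n)]
    return x


# Two measurements, three unknowns: both [1, 1, 0] and [0, 0, 1] explain y
# exactly, and the L1 penalty steers the solver to the sparser one.
A = [[1.0, 0.0, 1.0],
     [0.0, 1.0, 1.0]]
y = [1.0, 1.0]
x_hat = ista(A, y)
```

    For this toy system `x_hat` has a single non-zero entry in the last position, slightly below 1 because the penalty shrinks the amplitude. In an imaging setting `A` would combine the sampling pattern with the sparsifying basis, which is where choices such as the adaptive basis selection described above would enter.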

    Adapting Computer Vision Models To Limitations On Input Dimensionality And Model Complexity

    When considering instances of distributed systems where visual sensors communicate with remote predictive models, data traffic is limited to the capacity of communication channels, and hardware limits the processing of collected data prior to transmission. We study novel methods of adapting visual inference to limitations on complexity and data availability at test time, wherever the aforementioned limitations exist. Our contributions detailed in this thesis consider both task-specific and task-generic approaches to reducing the data requirement for inference, and evaluate our proposed methods on a wide range of computer vision tasks. This thesis makes four distinct contributions: (i) We investigate multi-class action classification via two-stream convolutional neural networks that directly ingest information extracted from compressed video bitstreams. We show that selective access to macroblock motion vector information provides a good low-dimensional approximation of the underlying optical flow in visual sequences. (ii) We devise a bitstream cropping method by which AVC/H.264 and H.265 bitstreams are reduced to the minimum amount of necessary elements for optical flow extraction, while maintaining compliance with codec standards. We additionally study the effect of codec rate-quality control on the sparsity and noise incurred on optical flow derived from resulting bitstreams, and do so for multiple coding standards. (iii) We demonstrate degrees of variability in the amount of data required for action classification, and leverage this to reduce the dimensionality of input volumes by inferring the required temporal extent for accurate classification prior to processing via learnable machines. (iv) We extend the Mixtures-of-Experts (MoE) paradigm to adapt the data cost of inference for any set of constituent experts. We postulate that the minimum acceptable data cost of inference varies for different input space partitions, and consider mixtures where each expert is designed to meet a different set of constraints on input dimensionality. To take advantage of the flexibility of such mixtures in processing different input representations and modalities, we train biased gating functions such that experts requiring less information to make their inferences are favoured over others. Finally, we note that our proposed data utility optimization solutions include a learnable component which considers specified priorities on the amount of information to be used prior to inference, and can be realized for any combination of tasks, modalities, and constraints on available data.
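
    The abstract does not describe how the gating bias is trained, so the snippet below is only one simple way to express a preference for cheaper experts: subtracting a cost penalty from each gate logit before the softmax. It is a hypothetical illustration, not the thesis's method; `cost_biased_gate`, `data_costs`, and `penalty` are names invented for the example.

```python
import math


def cost_biased_gate(logits, data_costs, penalty=1.0):
    """Softmax gate whose logits are penalized by each expert's data cost.

    Assumption: data_costs are non-negative numbers on a common scale
    (e.g. fraction of the full input an expert needs); penalty controls
    how strongly cheap experts are preferred.
    """
    scores = [l - penalty * c for l, c in zip(logits, data_costs)]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Three experts the gate rates equally, with increasing data requirements.
weights = cost_biased_gate([0.0, 0.0, 0.0], [0.1, 0.5, 1.0])
```

    With equal logits, the cheapest expert receives the largest mixture weight; setting `penalty=0.0` recovers an ordinary, unbiased softmax gate.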

    Performance Evaluation of Multivariate Interpolation Methods for Scattered Data in Geoscience Applications
