6,155 research outputs found

    Extended Bit-Plane Compression for Convolutional Neural Network Accelerators

    Get PDF
    After the tremendous success of convolutional neural networks in image classification, object detection, speech recognition, etc., there is now rising demand for deployment of these compute-intensive ML models on tightly power constrained embedded and mobile systems at low cost as well as for pushing the throughput in data centers. This has triggered a wave of research towards specialized hardware accelerators. Their performance is often constrained by I/O bandwidth and the energy consumption is dominated by I/O transfers to off-chip memory. We introduce and evaluate a novel, hardware-friendly compression scheme for the feature maps present within convolutional neural networks. We show that an average compression ratio of 4.4x relative to uncompressed data and a gain of 60% over existing method can be achieved for ResNet-34 with a compression block requiring <300 bit of sequential cells and minimal combinational logic

    Multiscale Adaptive Representation of Signals: I. The Basic Framework

    Full text link
    We introduce a framework for designing multi-scale, adaptive, shift-invariant frames and bi-frames for representing signals. The new framework, called AdaFrame, improves over dictionary learning-based techniques in terms of computational efficiency at inference time. It improves classical multi-scale basis such as wavelet frames in terms of coding efficiency. It provides an attractive alternative to dictionary learning-based techniques for low level signal processing tasks, such as compression and denoising, as well as high level tasks, such as feature extraction for object recognition. Connections with deep convolutional networks are also discussed. In particular, the proposed framework reveals a drawback in the commonly used approach for visualizing the activations of the intermediate layers in convolutional networks, and suggests a natural alternative

    Compressed Online Dictionary Learning for Fast fMRI Decomposition

    Get PDF
    We present a method for fast resting-state fMRI spatial decomposi-tions of very large datasets, based on the reduction of the temporal dimension before applying dictionary learning on concatenated individual records from groups of subjects. Introducing a measure of correspondence between spatial decompositions of rest fMRI, we demonstrates that time-reduced dictionary learning produces result as reliable as non-reduced decompositions. We also show that this reduction significantly improves computational scalability

    Learning quadrangulated patches for 3D shape parameterization and completion

    Full text link
    We propose a novel 3D shape parameterization by surface patches, that are oriented by 3D mesh quadrangulation of the shape. By encoding 3D surface detail on local patches, we learn a patch dictionary that identifies principal surface features of the shape. Unlike previous methods, we are able to encode surface patches of variable size as determined by the user. We propose novel methods for dictionary learning and patch reconstruction based on the query of a noisy input patch with holes. We evaluate the patch dictionary towards various applications in 3D shape inpainting, denoising and compression. Our method is able to predict missing vertices and inpaint moderately sized holes. We demonstrate a complete pipeline for reconstructing the 3D mesh from the patch encoding. We validate our shape parameterization and reconstruction methods on both synthetic shapes and real world scans. We show that our patch dictionary performs successful shape completion of complicated surface textures.Comment: To be presented at International Conference on 3D Vision 2017, 201

    Learning sparse representations of depth

    Full text link
    This paper introduces a new method for learning and inferring sparse representations of depth (disparity) maps. The proposed algorithm relaxes the usual assumption of the stationary noise model in sparse coding. This enables learning from data corrupted with spatially varying noise or uncertainty, typically obtained by laser range scanners or structured light depth cameras. Sparse representations are learned from the Middlebury database disparity maps and then exploited in a two-layer graphical model for inferring depth from stereo, by including a sparsity prior on the learned features. Since they capture higher-order dependencies in the depth structure, these priors can complement smoothness priors commonly used in depth inference based on Markov Random Field (MRF) models. Inference on the proposed graph is achieved using an alternating iterative optimization technique, where the first layer is solved using an existing MRF-based stereo matching algorithm, then held fixed as the second layer is solved using the proposed non-stationary sparse coding algorithm. This leads to a general method for improving solutions of state of the art MRF-based depth estimation algorithms. Our experimental results first show that depth inference using learned representations leads to state of the art denoising of depth maps obtained from laser range scanners and a time of flight camera. Furthermore, we show that adding sparse priors improves the results of two depth estimation methods: the classical graph cut algorithm by Boykov et al. and the more recent algorithm of Woodford et al.Comment: 12 page

    A sparsity-driven approach to multi-camera tracking in visual sensor networks

    Get PDF
    In this paper, a sparsity-driven approach is presented for multi-camera tracking in visual sensor networks (VSNs). VSNs consist of image sensors, embedded processors and wireless transceivers which are powered by batteries. Since the energy and bandwidth resources are limited, setting up a tracking system in VSNs is a challenging problem. Motivated by the goal of tracking in a bandwidth-constrained environment, we present a sparsity-driven method to compress the features extracted by the camera nodes, which are then transmitted across the network for distributed inference. We have designed special overcomplete dictionaries that match the structure of the features, leading to very parsimonious yet accurate representations. We have tested our method in indoor and outdoor people tracking scenarios. Our experimental results demonstrate how our approach leads to communication savings without significant loss in tracking performance

    DeepMatching: Hierarchical Deformable Dense Matching

    Get PDF
    We introduce a novel matching algorithm, called DeepMatching, to compute dense correspondences between images. DeepMatching relies on a hierarchical, multi-layer, correlational architecture designed for matching images and was inspired by deep convolutional approaches. The proposed matching algorithm can handle non-rigid deformations and repetitive textures and efficiently determines dense correspondences in the presence of significant changes between images. We evaluate the performance of DeepMatching, in comparison with state-of-the-art matching algorithms, on the Mikolajczyk (Mikolajczyk et al 2005), the MPI-Sintel (Butler et al 2012) and the Kitti (Geiger et al 2013) datasets. DeepMatching outperforms the state-of-the-art algorithms and shows excellent results in particular for repetitive textures.We also propose a method for estimating optical flow, called DeepFlow, by integrating DeepMatching in the large displacement optical flow (LDOF) approach of Brox and Malik (2011). Compared to existing matching algorithms, additional robustness to large displacements and complex motion is obtained thanks to our matching approach. DeepFlow obtains competitive performance on public benchmarks for optical flow estimation
    corecore