6,155 research outputs found
Extended Bit-Plane Compression for Convolutional Neural Network Accelerators
After the tremendous success of convolutional neural networks in image
classification, object detection, speech recognition, etc., there is now rising
demand for deployment of these compute-intensive ML models on tightly
power-constrained embedded and mobile systems at low cost, as well as for pushing the
throughput in data centers. This has triggered a wave of research towards
specialized hardware accelerators. Their performance is often constrained by
I/O bandwidth, and their energy consumption is dominated by I/O transfers to
off-chip memory. We introduce and evaluate a novel, hardware-friendly
compression scheme for the feature maps present within convolutional neural
networks. We show that an average compression ratio of 4.4x relative to
uncompressed data and a gain of 60% over existing methods can be achieved for
ResNet-34 with a compression block requiring fewer than 300 bits of sequential
cells and minimal combinational logic.
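The hardware scheme itself combines bit-plane coding with zero run-length encoding; the details are not reproduced here, but the core idea — why sparse, ReLU-style feature maps compress well once split into bit planes — can be sketched in a few lines. The function names and the toy 8x8 feature map below are illustrative assumptions, not the paper's design.

```python
import numpy as np

def bit_planes(fmap):
    """Split an 8-bit feature map into 8 binary bit planes (MSB first)."""
    assert fmap.dtype == np.uint8
    return np.stack([(fmap >> b) & 1 for b in range(7, -1, -1)])

def run_lengths(bits):
    """Run lengths of the flattened binary stream, a proxy for RLE cost."""
    flat = bits.ravel()
    change = np.flatnonzero(np.diff(flat)) + 1   # run boundaries
    bounds = np.concatenate(([0], change, [flat.size]))
    return np.diff(bounds)

fmap = np.zeros((8, 8), dtype=np.uint8)          # mostly-zero map, as after ReLU
fmap[0, :4] = [3, 0, 12, 0]
planes = bit_planes(fmap)                         # shape (8, 8, 8)
runs = run_lengths(planes)                        # few long zero-runs = cheap RLE
```

On a mostly-zero map the stream collapses into a handful of long zero runs, which is what makes a small sequential encoder sufficient.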
Multiscale Adaptive Representation of Signals: I. The Basic Framework
We introduce a framework for designing multi-scale, adaptive, shift-invariant
frames and bi-frames for representing signals. The new framework, called
AdaFrame, improves over dictionary learning-based techniques in terms of
computational efficiency at inference time. It improves classical multi-scale
bases such as wavelet frames in terms of coding efficiency. It provides an
attractive alternative to dictionary learning-based techniques for low level
signal processing tasks, such as compression and denoising, as well as high
level tasks, such as feature extraction for object recognition. Connections
with deep convolutional networks are also discussed. In particular, the
proposed framework reveals a drawback in the commonly used approach for
visualizing the activations of the intermediate layers in convolutional
networks, and suggests a natural alternative.
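The frame constructions in the paper are considerably richer than anything shown here, but the basic notion of a shift-invariant tight frame can be illustrated with an undecimated Haar pair (an illustrative stand-in, not AdaFrame itself): the average and difference bands are redundant, commute with shifts, and sum back exactly to the signal.

```python
import numpy as np

def haar_frame(x):
    """One level of an undecimated (shift-invariant) Haar frame."""
    shifted = np.roll(x, 1)
    low = (x + shifted) / 2.0   # local average band
    high = (x - shifted) / 2.0  # local difference (detail) band
    return low, high

def haar_inverse(low, high):
    """Tight-frame reconstruction: the two bands sum back to the signal."""
    return low + high

x = np.array([1.0, 4.0, 2.0, 8.0])
lo, hi = haar_frame(x)
print(np.allclose(haar_inverse(lo, hi), x))  # True: perfect reconstruction
```

The redundancy (two bands, no downsampling) is what buys shift-invariance over a critically sampled wavelet basis.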
Compressed Online Dictionary Learning for Fast fMRI Decomposition
We present a method for fast resting-state fMRI spatial decompositions of
very large datasets, based on the reduction of the temporal dimension before
applying dictionary learning on concatenated individual records from groups of
subjects. Introducing a measure of correspondence between spatial
decompositions of rest fMRI, we demonstrate that time-reduced dictionary
learning produces results as reliable as non-reduced decompositions. We also
show that this reduction significantly improves computational scalability.
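As a rough numerical sketch of the time-reduction idea (the sizes, the random stand-in data, and the use of a plain truncated SVD are all assumptions for the demo, not the paper's pipeline):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 500))   # toy stand-in: n_timepoints x n_voxels

# Reduce the temporal dimension with a truncated SVD before dictionary
# learning; k = 20 retained components stand in for the time reduction.
k = 20
U, s, Vt = np.linalg.svd(X, full_matrices=False)
X_reduced = s[:k, None] * Vt[:k]      # k x n_voxels; spatial structure kept

# X_reduced would then feed an online dictionary learner (e.g.
# sklearn.decomposition.MiniBatchDictionaryLearning) at ~10x lower cost.
```

Dictionary learning cost scales with the number of samples (here, timepoints), so shrinking 200 rows to 20 before fitting is where the scalability gain comes from.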
Learning quadrangulated patches for 3D shape parameterization and completion
We propose a novel 3D shape parameterization by surface patches that are
oriented by 3D mesh quadrangulation of the shape. By encoding 3D surface detail
on local patches, we learn a patch dictionary that identifies principal surface
features of the shape. Unlike previous methods, we are able to encode surface
patches of variable size as determined by the user. We propose novel methods
for dictionary learning and patch reconstruction based on the query of a noisy
input patch with holes. We evaluate the patch dictionary towards various
applications in 3D shape inpainting, denoising and compression. Our method is
able to predict missing vertices and inpaint moderately sized holes. We
demonstrate a complete pipeline for reconstructing the 3D mesh from the patch
encoding. We validate our shape parameterization and reconstruction methods on
both synthetic shapes and real world scans. We show that our patch dictionary
performs successful shape completion of complicated surface textures.
Comment: To be presented at International Conference on 3D Vision 2017
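The dictionary learning and reconstruction methods in the paper are specific to quadrangulated surface patches; as a generic illustration of hole-filling with a patch dictionary (the sizes and the random dictionary below are assumed for the demo), the observed entries of a patch can be fit by least squares and the holes read off from the reconstruction:

```python
import numpy as np

rng = np.random.default_rng(1)
d, n_atoms = 64, 32                       # 8x8 patches, toy dictionary size
D = rng.standard_normal((d, n_atoms))     # stand-in for a learned patch dictionary

patch = D @ rng.standard_normal(n_atoms)  # a patch in the dictionary's span
mask = rng.random(d) > 0.2                # True = observed vertex, False = hole

# Fit coefficients on the observed entries only, then read off the holes.
coef, *_ = np.linalg.lstsq(D[mask], patch[mask], rcond=None)
completed = D @ coef
```

Because the patch lies in the dictionary's span and enough entries are observed, the least-squares fit recovers the missing vertices exactly in this toy setting; real scans add noise and model mismatch.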
Learning sparse representations of depth
This paper introduces a new method for learning and inferring sparse
representations of depth (disparity) maps. The proposed algorithm relaxes the
usual assumption of the stationary noise model in sparse coding. This enables
learning from data corrupted with spatially varying noise or uncertainty,
typically obtained by laser range scanners or structured light depth cameras.
Sparse representations are learned from the Middlebury database disparity maps
and then exploited in a two-layer graphical model for inferring depth from
stereo, by including a sparsity prior on the learned features. Since they
capture higher-order dependencies in the depth structure, these priors can
complement smoothness priors commonly used in depth inference based on Markov
Random Field (MRF) models. Inference on the proposed graph is achieved using an
alternating iterative optimization technique, where the first layer is solved
using an existing MRF-based stereo matching algorithm, then held fixed as the
second layer is solved using the proposed non-stationary sparse coding
algorithm. This leads to a general method for improving solutions of
state-of-the-art MRF-based depth estimation algorithms. Our experimental
results first show that depth inference using learned representations leads to
state-of-the-art denoising of depth maps obtained from laser range scanners and
a time-of-flight camera. Furthermore, we show that adding sparse priors
improves the results of two depth estimation methods: the classical graph cut
algorithm by Boykov et al. and the more recent algorithm of Woodford et al.
Comment: 12 pages
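The non-stationary sparse coding algorithm itself is given in the paper; a minimal numerical sketch of the idea — down-weighting unreliable pixels in the data term — can be written as a weighted ISTA. The solver choice, toy sizes, and parameter values here are assumptions, not the authors' implementation.

```python
import numpy as np

def soft(v, t):
    """Soft-thresholding, the proximal operator of the l1 norm."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def weighted_ista(D, x, w, lam=0.01, n_iter=1000):
    """Sparse coding under a non-stationary noise model: minimizes
    ||w * (x - D z)||^2 + lam * ||z||_1, where w holds per-pixel
    reliability weights (low weight = noisy/uncertain measurement)."""
    Dw = D * w[:, None]                 # fold the weights into the dictionary
    L = np.linalg.norm(Dw, 2) ** 2      # scale for the gradient step
    z = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = Dw.T @ (Dw @ z - w * x)  # (half-)gradient of the data term
        z = soft(z - grad / L, lam / (2.0 * L))
    return z

# Toy demo: with uniform weights this reduces to ordinary sparse coding.
rng = np.random.default_rng(2)
D = rng.standard_normal((20, 10))
D /= np.linalg.norm(D, axis=0)          # unit-norm atoms
z_true = np.zeros(10); z_true[2] = 1.5
x = D @ z_true
z = weighted_ista(D, x, w=np.ones(20))
```

Setting `w` from a per-pixel uncertainty map (e.g. from a structured-light sensor's confidence output) is what relaxes the stationary-noise assumption of standard sparse coding.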
A sparsity-driven approach to multi-camera tracking in visual sensor networks
In this paper, a sparsity-driven approach is presented for multi-camera tracking in visual sensor networks (VSNs). VSNs consist of image sensors, embedded processors and wireless transceivers which are powered by batteries. Since the energy and bandwidth resources are limited, setting up a tracking system in VSNs is a challenging problem. Motivated by the goal of tracking in a bandwidth-constrained environment, we present a sparsity-driven method to compress the features extracted by the camera nodes, which are then transmitted across the network for distributed inference. We have designed special overcomplete dictionaries that match the structure of the features, leading to very parsimonious yet accurate representations. We have tested our method in indoor and outdoor people tracking scenarios. Our experimental results demonstrate how our approach leads to communication savings without significant loss in tracking performance.
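The paper's dictionaries are designed to match the feature structure; as a generic illustration of how sparse coding cuts the transmitted payload (the greedy matching-pursuit coder and toy sizes below are assumptions, not the authors' design), a feature vector can be sent as a handful of (atom index, coefficient) pairs instead of all its entries:

```python
import numpy as np

def matching_pursuit(D, x, n_nonzero=3):
    """Greedy sparse code of x in dictionary D (unit-norm columns):
    only the (index, coefficient) pairs need to be transmitted."""
    r = x.astype(float).copy()
    code = {}
    for _ in range(n_nonzero):
        j = int(np.argmax(np.abs(D.T @ r)))    # most correlated atom
        c = float(D[:, j] @ r)
        code[j] = code.get(j, 0.0) + c
        r -= c * D[:, j]                        # residual strictly shrinks
    return code

def reconstruct(D, code):
    x_hat = np.zeros(D.shape[0])
    for j, c in code.items():
        x_hat += c * D[:, j]
    return x_hat

rng = np.random.default_rng(4)
D = rng.standard_normal((16, 40))               # overcomplete: 40 atoms, dim 16
D /= np.linalg.norm(D, axis=0)
x = 2.0 * D[:, 5] + 0.5 * D[:, 17]              # a feature built from two atoms
code = matching_pursuit(D, x, n_nonzero=4)
err = np.linalg.norm(reconstruct(D, code) - x)
```

A node transmits `code` (a few index/value pairs) rather than the full 16-dimensional vector; the receiver, sharing `D`, reconstructs the feature for inference.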
DeepMatching: Hierarchical Deformable Dense Matching
We introduce a novel matching algorithm, called DeepMatching, to compute
dense correspondences between images. DeepMatching relies on a hierarchical,
multi-layer, correlational architecture designed for matching images and was
inspired by deep convolutional approaches. The proposed matching algorithm can
handle non-rigid deformations and repetitive textures and efficiently
determines dense correspondences in the presence of significant changes between
images. We evaluate the performance of DeepMatching, in comparison with
state-of-the-art matching algorithms, on the Mikolajczyk (Mikolajczyk et al.
2005), the MPI-Sintel (Butler et al. 2012) and the KITTI (Geiger et al. 2013)
datasets. DeepMatching outperforms the state-of-the-art algorithms and shows
excellent results in particular for repetitive textures. We also propose a
method for estimating optical flow, called DeepFlow, by integrating
DeepMatching in the large displacement optical flow (LDOF) approach of Brox and
Malik (2011). Compared to existing matching algorithms, additional robustness
to large displacements and complex motion is obtained thanks to our matching
approach. DeepFlow obtains competitive performance on public benchmarks for
optical flow estimation.
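DeepMatching builds a multi-level correlation pyramid; only its bottom level — correlating small patches of one image against patches of the other — is sketched below, with the patch size and non-overlapping grid chosen for brevity (the full algorithm aggregates these scores hierarchically and allows patch deformations):

```python
import numpy as np

def patch_correlation(im1, im2, psize=4):
    """Bottom level of a DeepMatching-style pyramid: correlate every
    (non-overlapping) patch of im1 with every patch of im2.
    Assumes both images share the same shape."""
    h, w = im1.shape
    def patches(im):
        ps = [im[i:i + psize, j:j + psize].ravel()
              for i in range(0, h - psize + 1, psize)
              for j in range(0, w - psize + 1, psize)]
        P = np.asarray(ps, dtype=float)
        return P / (np.linalg.norm(P, axis=1, keepdims=True) + 1e-8)
    P1, P2 = patches(im1), patches(im2)
    return P1 @ P2.T   # rows: im1 patches, cols: im2 patches

rng = np.random.default_rng(3)
im = rng.random((8, 8))
C = patch_correlation(im, im)   # self-matching: each patch matches itself best
```

In the full method these local correlation maps are max-pooled and recombined level by level, which is what lets the matching tolerate non-rigid deformation while staying dense.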