Practical Full Resolution Learned Lossless Image Compression
We propose the first practical learned lossless image compression system,
L3C, and show that it outperforms the popular engineered codecs, PNG, WebP and
JPEG 2000. At the core of our method is a fully parallelizable hierarchical
probabilistic model for adaptive entropy coding which is optimized end-to-end
for the compression task. In contrast to recent autoregressive discrete
probabilistic models such as PixelCNN, our method i) models the image
distribution jointly with learned auxiliary representations instead of
exclusively modeling the image distribution in RGB space, and ii) only requires
three forward-passes to predict all pixel probabilities instead of one for each
pixel. As a result, L3C obtains over two orders of magnitude speedups when
sampling compared to the fastest PixelCNN variant (Multiscale-PixelCNN).
Furthermore, we find that learning the auxiliary representations is crucial: it
significantly outperforms predefined alternatives such as an RGB pyramid.

Comment: Updated preprocessing and Table 1; see A.1 in the supplementary. Code and
models: https://github.com/fab-jul/L3C-PyTorch
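The core idea behind adaptive entropy coding in this abstract is that a learned model assigns a probability to each pixel value, and an entropy coder spends close to -log2(p) bits per symbol; a sharper model therefore yields a smaller file. A minimal sketch of that bits-from-probabilities relationship (illustrative only, not the authors' code; the function name and toy distributions are assumptions):

```python
import math

def codelength_bits(symbols, probs):
    """Ideal (Shannon) code length, in bits, for coding `symbols` under the
    model distribution probs[symbol]. An adaptive arithmetic coder driven by
    a probabilistic model achieves close to this bound."""
    return sum(-math.log2(probs[s]) for s in symbols)

# A model that concentrates probability on the true symbols costs fewer bits:
uniform = {s: 1 / 4 for s in range(4)}       # 2 bits per symbol, always
sharp = {0: 0.7, 1: 0.1, 2: 0.1, 3: 0.1}     # cheaper when 0 dominates

data = [0, 0, 0, 1]
print(codelength_bits(data, uniform))  # 8.0 bits
print(codelength_bits(data, sharp))    # fewer bits than uniform
```

The same principle explains why L3C is trained end-to-end for the compression task: minimizing the model's cross-entropy on images directly minimizes the expected code length.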
Lightweight Monocular Depth Estimation Model by Joint End-to-End Filter Pruning
Convolutional neural networks (CNNs) have emerged as the state-of-the-art in
multiple vision tasks, including depth estimation. However, the memory and
computing-power requirements of these models remain a challenge. Monocular
depth estimation has significant applications in robotics and virtual reality,
which require deployment on low-end devices. Training a small model from
scratch results in a significant drop in accuracy, and it does not benefit from
pre-trained large models. Motivated by the model-pruning literature, we
propose a lightweight monocular depth model obtained from a large trained
model. This is achieved by removing the least important features with a novel
joint end-to-end filter pruning. We propose to learn a binary mask for each
filter to decide whether to drop the filter or not. These masks are trained
jointly to exploit relations between filters at different layers as well as
redundancy within the same layer. We show that we can achieve around a 5x
compression rate with a small drop in accuracy on the KITTI driving dataset. We
also show that masking can improve accuracy over the baseline with fewer
parameters, even without enforcing a compression loss.
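The masking idea above can be sketched as learning a real-valued score per filter and binarizing it in the forward pass to decide keep-or-prune; the compression rate is then the ratio of original to surviving filters. This is a toy illustration under assumed names (`binarize`, `compression_rate`, and the example scores are not from the paper):

```python
def binarize(scores, threshold=0.0):
    """Forward-pass binarization: keep filter i iff its learned score exceeds
    the threshold. During training, gradients would flow to the real-valued
    scores (e.g. via a straight-through-style estimator)."""
    return [1 if s > threshold else 0 for s in scores]

def compression_rate(filters_per_layer, masks):
    """Ratio of original filters to kept filters across all layers,
    given per-layer binary masks (1 = keep, 0 = prune)."""
    total = sum(filters_per_layer)
    kept = sum(sum(m) for m in masks)
    return total / kept

# Illustrative learned scores for a two-layer model with 4 filters per layer:
scores = [[0.9, -0.3, 0.1, -0.8], [-0.2, 0.5, -0.6, -0.1]]
masks = [binarize(layer) for layer in scores]
print(masks)                            # [[1, 0, 1, 0], [0, 1, 0, 0]]
print(compression_rate([4, 4], masks))  # 8/3 ≈ 2.67x fewer filters
```

Because all masks are produced and trained together, the procedure can trade off redundancy across layers rather than pruning each layer in isolation, which is the "joint" aspect the abstract emphasizes.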