Learning to Structure an Image with Few Colors
Color and structure are the two pillars that construct an image. Usually, the
structure is well expressed through a rich spectrum of colors, allowing objects
in an image to be recognized by neural networks. However, under extreme
color-space limitations, the structure tends to vanish, and a neural
network may then fail to understand the image. Interested in this
interplay between color and structure, we study the problem of
identifying and preserving the most informative image structures while
constraining the color space to just a few bits, such that the resulting image
can still be recognized as accurately as possible. To this end, we propose a color
quantization network, ColorCNN, which learns to structure images end-to-end
from the classification loss. Given a color space size, ColorCNN
quantizes colors in the original image by generating a color index map and an
RGB color palette. Then, this color-quantized image is fed to a pre-trained
task network to evaluate its performance. In our experiment, with only a 1-bit
color space (i.e., two colors), the proposed network achieves 82.1% top-1
accuracy on the CIFAR10 dataset, outperforming traditional color quantization
methods by a large margin. As an application, when its output is encoded with
PNG, the proposed color quantization outperforms other image compression
methods in the extremely low bit-rate regime. The code is available at:
https://github.com/hou-yz/color_distillation
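The representation the abstract describes, a color index map plus an RGB palette, is easy to make concrete. Below is a minimal sketch assuming a fixed palette rather than ColorCNN's learned, end-to-end-trained one; the quantize helper and its names are illustrative, not taken from the released code.

```python
# Minimal sketch of the index-map-plus-palette representation, with a fixed
# palette instead of ColorCNN's learned one; names here are illustrative.
import numpy as np

def quantize(image, palette):
    """image: (H, W, 3) floats in [0, 1]; palette: (K, 3) RGB colors."""
    # Distance from each pixel to each palette color: (H, W, K)
    dists = np.linalg.norm(image[:, :, None, :] - palette[None, None, :, :],
                           axis=-1)
    index_map = dists.argmin(axis=-1)   # (H, W) color index map
    quantized = palette[index_map]      # (H, W, 3) color-quantized image
    return index_map, quantized

# A 1-bit color space (two colors), as in the CIFAR10 experiment above
palette = np.array([[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]])
image = np.random.rand(32, 32, 3)       # stand-in for a CIFAR10 image
index_map, quantized = quantize(image, palette)
```

In ColorCNN itself the palette and index map are produced by a network and trained through the downstream classifier's loss; the nearest-color assignment above only illustrates what the quantized output looks like.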
V2VNet: Vehicle-to-Vehicle Communication for Joint Perception and Prediction
In this paper, we explore the use of vehicle-to-vehicle (V2V) communication
to improve the perception and motion forecasting performance of self-driving
vehicles. By intelligently aggregating the information received from multiple
nearby vehicles, we can observe the same scene from different viewpoints. This
allows us to see through occlusions and detect actors at long range, where the
observations are very sparse or non-existent. We also show that our approach of
sending compressed deep feature map activations achieves high accuracy while
satisfying communication bandwidth requirements.
Comment: ECCV 2020 (Oral)
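To make the compress-then-fuse idea concrete, here is a minimal sketch in which each vehicle channel-reduces an intermediate feature map before transmission and the receiver restores and averages the messages. The 1x1-convolution bottleneck and mean fusion are assumptions for illustration; the paper's actual compression and aggregation mechanisms are more sophisticated.

```python
# Hedged sketch of V2V feature sharing: compress, transmit, restore, fuse.
# The bottleneck layers and mean fusion are illustrative assumptions, not
# V2VNet's actual architecture; spatial warping between frames is omitted.
import torch
import torch.nn as nn

compress = nn.Conv2d(128, 16, kernel_size=1)     # sender-side channel bottleneck
decompress = nn.Conv2d(16, 128, kernel_size=1)   # receiver-side expansion

def fuse(messages):
    """messages: list of (1, 16, H, W) compressed feature maps."""
    restored = [decompress(m) for m in messages]
    return torch.stack(restored).mean(dim=0)     # simple mean aggregation

ego = torch.randn(1, 128, 64, 64)                # ego vehicle's feature map
received = [compress(torch.randn(1, 128, 64, 64)) for _ in range(3)]
fused = fuse(received + [compress(ego)])         # (1, 128, 64, 64)
```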
Deep Multi-Task Learning for Joint Localization, Perception, and Prediction
Over the last few years, we have witnessed tremendous progress on many
subtasks of autonomous driving, including perception, motion forecasting, and
motion planning. However, these systems often assume that the car is accurately
localized against a high-definition map. In this paper we question this
assumption, and investigate the issues that arise in state-of-the-art autonomy
stacks under localization error. Based on our observations, we design a system
that jointly performs perception, prediction, and localization. Our
architecture reuses computation across these tasks, and is thus able
to correct localization errors efficiently. We show experiments on a
large-scale autonomy dataset, demonstrating the efficiency and accuracy of our
proposed approach.
Comment: CVPR 2
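The computation reuse the abstract mentions comes from sharing a backbone across task heads. The sketch below assumes a single bird's-eye-view input and three illustrative heads; the module names and sizes are hypothetical, not the paper's architecture.

```python
# Hedged sketch of shared computation across perception, prediction, and
# localization; all module names and sizes are illustrative assumptions.
import torch
import torch.nn as nn

class JointAutonomyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(             # shared feature extractor
            nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
        )
        self.perception_head = nn.Conv2d(64, 10, 1)   # e.g. detection logits
        self.prediction_head = nn.Conv2d(64, 8, 1)    # e.g. future motion field
        self.localization_head = nn.Linear(64, 3)     # pose correction (x, y, yaw)

    def forward(self, bev):
        feats = self.backbone(bev)                 # computed once, reused by all heads
        pooled = feats.mean(dim=(2, 3))            # global pooling for the pose head
        return (self.perception_head(feats),
                self.prediction_head(feats),
                self.localization_head(pooled))

net = JointAutonomyNet()
detections, forecasts, pose_delta = net(torch.randn(1, 3, 128, 128))
```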
Learning to Localize Through Compressed Binary Maps
One of the main difficulties of scaling current localization systems to large
environments is the on-board storage required for the maps. In this paper we
propose to learn to compress the map representation such that it is optimal for
the localization task. As a consequence, higher compression rates can be
achieved without loss of localization accuracy when compared to standard coding
schemes that optimize for reconstruction, thus ignoring the end task. Our
experiments show that it is possible to learn a task-specific compression which
reduces storage requirements by two orders of magnitude over general-purpose
codecs such as WebP without sacrificing performance.
Comment: 18 pages, 12 figures, 6 tables; Presented at CVPR 201
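The key idea, training the compressor against the end task rather than reconstruction error, can be sketched with a binarizing encoder trained through a straight-through estimator. Everything below is an illustrative assumption, including the placeholder loss; it is not the paper's pipeline.

```python
# Hedged sketch of task-driven binary map compression: a tiny encoder emits a
# binary code via a straight-through sign, so gradients from a downstream
# localization loss (a placeholder here) can train it. Names are illustrative.
import torch
import torch.nn as nn

class BinaryEncoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(1, 8, 3, stride=2, padding=1)

    def forward(self, x):
        logits = self.conv(x)
        hard = torch.sign(logits)                  # {-1, +1} binary map code
        # Straight-through estimator: binary values forward, identity backward
        return hard.detach() + logits - logits.detach()

encoder = BinaryEncoder()
map_tile = torch.randn(1, 1, 64, 64)               # stand-in map patch
code = encoder(map_tile)
# In the paper the training signal comes from the localization task; here a
# placeholder loss stands in so the example runs end to end.
loss = (code - torch.ones_like(code)).pow(2).mean()
loss.backward()                                    # gradients reach the encoder
```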