4 research outputs found

Learning to Structure an Image with Few Colors

Authors: Gould, Stephen; Hou, Yunzhong; Zheng, Liang
Publication date: 17/03/2020

Color and structure are the two pillars that construct an image. Usually, the structure is well expressed through a rich spectrum of colors, allowing objects in an image to be recognized by neural networks. However, under extreme limitations of color space, the structure tends to vanish, and thus a neural network might fail to understand the image. Interested in exploring this interplay between color and structure, we study the scientific problem of identifying and preserving the most informative image structures while constraining the color space to just a few bits, such that the resulting image can be recognized with possibly high accuracy. To this end, we propose a color quantization network, ColorCNN, which learns to structure the images from the classification loss in an end-to-end manner. Given a color space size, ColorCNN quantizes colors in the original image by generating a color index map and an RGB color palette. Then, this color-quantized image is fed to a pre-trained task network to evaluate its performance. In our experiment, with only a 1-bit color space (i.e., two colors), the proposed network achieves 82.1% top-1 accuracy on the CIFAR10 dataset, outperforming traditional color quantization methods by a large margin. For applications, when encoded with PNG, the proposed color quantization shows superiority over other image compression methods in the extremely low bit-rate regime. The code is available at: https://github.com/hou-yz/color_distillation
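
The abstract above describes the quantized image as a color index map plus an RGB color palette. The sketch below illustrates only that output representation, not ColorCNN itself: the helper names `num_colors` and `apply_palette`, the black-and-white palette, and the 2x3 index map are all hypothetical values chosen for illustration.

```python
def num_colors(bits):
    """Number of palette entries available in a color space of `bits` bits."""
    return 2 ** bits


def apply_palette(index_map, palette):
    """Rebuild an RGB image by replacing each pixel's index with its palette color."""
    return [[palette[index] for index in row] for row in index_map]


# Hypothetical 1-bit example: two palette colors and a 2x3 index map.
palette = [(0, 0, 0), (255, 255, 255)]
index_map = [
    [0, 1, 1],
    [1, 0, 0],
]

# Every index must refer to an existing palette entry.
assert len(palette) == num_colors(1)
assert all(0 <= index < len(palette) for row in index_map for index in row)

quantized_image = apply_palette(index_map, palette)
```

In this representation each pixel costs only `bits` bits and the palette is stored once per image, which is why a palette-based format such as PNG is a natural container for it.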

arXiv.org e-Print Archive

V2VNet: Vehicle-to-Vehicle Communication for Joint Perception and Prediction

Authors: Liang, Ming; Manivasagam, Sivabalan; Tu, James; Urtasun, Raquel; Wang, Tsun-Hsuan; Yang, Bin; Zeng, Wenyuan
Publication date: 17/08/2020

In this paper, we explore the use of vehicle-to-vehicle (V2V) communication to improve the perception and motion forecasting performance of self-driving vehicles. By intelligently aggregating the information received from multiple nearby vehicles, we can observe the same scene from different viewpoints. This allows us to see through occlusions and detect actors at long range, where the observations are very sparse or non-existent. We also show that our approach of sending compressed deep feature map activations achieves high accuracy while satisfying communication bandwidth requirements.

Comment: ECCV 2020 (Oral

arXiv.org e-Print Archive

Deep Multi-Task Learning for Joint Localization, Perception, and Prediction

Authors: Bârsan, Ioan Andrei; Casas, Sergio; Martinez, Julieta; Phillips, John; Sadat, Abbas; Urtasun, Raquel
Publication date: 10/04/2021

Over the last few years, we have witnessed tremendous progress on many subtasks of autonomous driving, including perception, motion forecasting, and motion planning. However, these systems often assume that the car is accurately localized against a high-definition map. In this paper we question this assumption, and investigate the issues that arise in state-of-the-art autonomy stacks under localization error. Based on our observations, we design a system that jointly performs perception, prediction, and localization. Our architecture is able to reuse computation between both tasks, and is thus able to correct localization errors efficiently. We show experiments on a large-scale autonomy dataset, demonstrating the efficiency and accuracy of our proposed approach.

Comment: CVPR 2

arXiv.org e-Print Archive

Learning to Localize Through Compressed Binary Maps

Authors: Bârsan, Ioan Andrei; Martinez, Julieta; Urtasun, Raquel; Wang, Shenlong; Wei, Xinkai
Publication date: 20/12/2020

One of the main difficulties of scaling current localization systems to large environments is the on-board storage required for the maps. In this paper we propose to learn to compress the map representation such that it is optimal for the localization task. As a consequence, higher compression rates can be achieved without loss of localization accuracy when compared to standard coding schemes that optimize for reconstruction, thus ignoring the end task. Our experiments show that it is possible to learn a task-specific compression which reduces storage requirements by two orders of magnitude over general-purpose codecs such as WebP without sacrificing performance.

Comment: 18 pages, 12 figures, 6 tables; Presented at CVPR 201

arXiv.org e-Print Archive