End-to-end Learning of Driving Models from Large-scale Video Datasets
Robust perception-action models should be learned from training data with
diverse visual appearances and realistic behaviors, yet current approaches to
deep visuomotor policy learning have been generally limited to in-situ models
learned from a single vehicle or a simulation environment. We advocate learning
a generic vehicle motion model from large-scale crowd-sourced video data, and
develop an end-to-end trainable architecture for learning to predict a
distribution over future vehicle egomotion from instantaneous monocular camera
observations and previous vehicle state. Our model incorporates a novel
FCN-LSTM architecture, which can be learned from large-scale crowd-sourced
vehicle action data, and leverages available scene segmentation side tasks to
improve performance under a privileged learning paradigm.
Comment: camera-ready for CVPR 2017
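Below is a minimal PyTorch sketch of how an FCN-LSTM egomotion predictor of this kind might be wired: a fully convolutional encoder summarizes each frame, and an LSTM fuses the visual features with the previous vehicle state to output a distribution over discrete future motions. All module choices, dimensions, and the discrete action head are illustrative assumptions, not the authors' implementation, and the segmentation side task is omitted.

```python
import torch
import torch.nn as nn

class FCNLSTM(nn.Module):
    """Illustrative FCN-LSTM: a fully convolutional encoder feeds an LSTM
    that fuses per-frame visual features with the previous vehicle state
    and outputs a distribution over discrete future egomotion actions."""

    def __init__(self, num_actions=4, state_dim=2, hidden_dim=64):
        super().__init__()
        # Fully convolutional visual encoder (stand-in for a dilated FCN).
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # global pooling -> one feature vector
        )
        # LSTM over time, conditioned on image features + previous state.
        self.lstm = nn.LSTM(64 + state_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, num_actions)

    def forward(self, frames, prev_states):
        # frames: (B, T, 3, H, W); prev_states: (B, T, state_dim)
        B, T = frames.shape[:2]
        feats = self.encoder(frames.flatten(0, 1)).flatten(1).view(B, T, -1)
        out, _ = self.lstm(torch.cat([feats, prev_states], dim=-1))
        return self.head(out).log_softmax(-1)  # log-probs over future motions
```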
SkipNet: Learning Dynamic Routing in Convolutional Networks
While deeper convolutional networks are needed to achieve maximum accuracy in
visual perception tasks, for many inputs shallower networks are sufficient. We
exploit this observation by learning to skip convolutional layers on a
per-input basis. We introduce SkipNet, a modified residual network that uses a
gating network to selectively skip convolutional blocks based on the
activations of the previous layer. We formulate the dynamic skipping problem in
the context of sequential decision making and propose a hybrid learning
algorithm that combines supervised learning and reinforcement learning to
address the challenges of non-differentiable skipping decisions. We show
SkipNet reduces computation by 30-90% while preserving the accuracy of the
original model on four benchmark datasets and outperforms the state-of-the-art
dynamic networks and static compression methods. We also qualitatively evaluate
the gating policy to reveal a relationship between image scale and saliency and
the number of layers skipped.
Comment: ECCV 2018 camera-ready version. Code is available at
https://github.com/ucbdrive/skipnet
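The skipping mechanism can be sketched as a gated residual block: a lightweight gate inspects the incoming activations and decides whether to execute the block or pass the input through unchanged. The feed-forward gate below is an illustrative simplification, not the paper's implementation; the supervised soft-gate warm-up and REINFORCE fine-tuning for hard decisions are only indicated in comments.

```python
import torch
import torch.nn as nn

class GatedResidualBlock(nn.Module):
    """Illustrative SkipNet-style block: a small gate looks at the incoming
    activations and decides whether to run the residual body or skip it."""

    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
        )
        # Gate: pool the input, then predict an execute/skip probability.
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(channels, 1), nn.Sigmoid(),
        )

    def forward(self, x, hard=False):
        p = self.gate(x)                      # (B, 1): prob. of executing
        g = (p > 0.5).float() if hard else p  # hard gate at inference time
        g = g.view(-1, 1, 1, 1)
        # The soft gate keeps the graph differentiable for supervised
        # warm-up; hard decisions would be trained with a policy gradient
        # (REINFORCE), rewarding accuracy and the compute saved.
        return torch.relu(x + g * self.body(x)), p
```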
Multi-Content GAN for Few-Shot Font Style Transfer
In this work, we focus on the challenge of taking partial observations of
highly-stylized text and generalizing the observations to generate unobserved
glyphs in the ornamented typeface. To generate a set of multi-content images
following a consistent style from very few examples, we propose an end-to-end
stacked conditional GAN model considering content along channels and style
along network layers. Our proposed network transfers the style of given glyphs
to the contents of unseen ones, capturing highly stylized fonts found in the
real world, such as those on movie posters or infographics. We seek to transfer
both the typographic stylization (e.g., serifs and ears) as well as the textual
stylization (e.g., color gradients and effects). We base our experiments on our
collected dataset of 10,000 fonts with different styles and demonstrate
effective generalization from a very small number of observed glyphs.
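As a rough illustration of the "content along channels" design, the sketch below stacks a font's glyphs along the channel axis, zeroing the channels of unobserved glyphs, and lets a convolutional generator predict the complete set. The glyph count, image size, and layer choices are assumptions for exposition; the paper's stacked MC-GAN, with its separate ornamentation stage and conditional discriminators, is only indicated in comments.

```python
import torch
import torch.nn as nn

# Assumed setup: 26 capital glyphs per font, each a 64x64 grayscale image.
# Observed glyphs occupy their own channel; unobserved channels are zeroed.
NUM_GLYPHS, SIZE = 26, 64

class GlyphGenerator(nn.Module):
    """Illustrative content-along-channels generator: takes the partially
    observed glyph stack and predicts the complete 26-channel glyph set."""

    def __init__(self, width=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(NUM_GLYPHS, width, 5, padding=2), nn.ReLU(),
            nn.Conv2d(width, width, 5, padding=2), nn.ReLU(),
            nn.Conv2d(width, NUM_GLYPHS, 5, padding=2), nn.Tanh(),
        )

    def forward(self, observed_stack):
        # observed_stack: (B, 26, 64, 64), zeros where glyphs are unseen.
        return self.net(observed_stack)

# A second, style-transfer stage (ornamentation) would be stacked on top,
# and both stages trained adversarially against conditional discriminators.
gen = GlyphGenerator()
partial = torch.zeros(1, NUM_GLYPHS, SIZE, SIZE)
partial[:, :3] = torch.rand(1, 3, SIZE, SIZE) * 2 - 1  # e.g., 3 observed glyphs
full_set = gen(partial)  # (1, 26, 64, 64) predicted glyph channels
```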
- …