This paper proposes a deep learning architecture based on Residual Network
that dynamically adjusts the number of executed layers for the regions of the
image. This architecture is end-to-end trainable, deterministic and
problem-agnostic. It is therefore applicable without any modifications to a
wide range of computer vision problems such as image classification, object
detection and image segmentation. We present experimental results showing that
this model improves the computational efficiency of Residual Networks on the
challenging ImageNet classification and COCO object detection datasets.
Additionally, we evaluate the computation time maps on the visual saliency
dataset cat2000 and find that they correlate surprisingly well with human eye
fixation positions.Comment: CVPR 201