Spatially Adaptive Computation Time for Residual Networks

Author: Collins Maxwell D.
Figurnov Michael
Huang Jonathan
Salakhutdinov Ruslan
Vetrov Dmitry
Zhang Li
Zhu Yukun
Publication date: 02/07/2017

This paper proposes a deep learning architecture based on Residual Networks that dynamically adjusts the number of executed layers for the regions of the image. This architecture is end-to-end trainable, deterministic and problem-agnostic. It is therefore applicable without any modifications to a wide range of computer vision problems such as image classification, object detection and image segmentation. We present experimental results showing that this model improves the computational efficiency of Residual Networks on the challenging ImageNet classification and COCO object detection datasets. Additionally, we evaluate the computation time maps on the visual saliency dataset cat2000 and find that they correlate surprisingly well with human eye fixation positions. Comment: CVPR 201
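The abstract does not spell out how the number of executed layers is chosen per region. As a rough illustration only, the sketch below assumes a cumulative halting rule in the style of adaptive computation time: each residual unit emits a halting score per spatial position, and a position stops once its running sum reaches `1 - eps`. The function names, the `eps` value, and the nested-list data layout are hypothetical and not taken from the paper.

```python
def units_executed(halting_scores, eps=0.01):
    """Count how many units run at one position (assumed halting rule).

    halting_scores: per-unit halting scores in [0, 1] for a single position.
    """
    cumulative = 0.0
    for n, score in enumerate(halting_scores, start=1):
        cumulative += score
        if cumulative >= 1.0 - eps:
            return n  # position halts after unit n
    return len(halting_scores)  # never halted: every unit was executed


def computation_time_map(score_maps, eps=0.01):
    """Build a per-position map of executed units.

    score_maps: list of 2-D grids (lists of rows), one grid per residual unit.
    """
    height = len(score_maps[0])
    width = len(score_maps[0][0])
    return [
        [units_executed([grid[y][x] for grid in score_maps], eps)
         for x in range(width)]
        for y in range(height)
    ]
```

Under this assumption, positions with large early halting scores receive fewer units, which is the kind of per-region computation time map the abstract compares against eye fixation data.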

arXiv.org e-Print Archive

Crossref

SegSort: Segmentation by Discriminative Sorting of Segments

Author: Chen Liang-Chieh
Collins Maxwell D.
Hwang Jyh-Jing
Shi Jianbo
Yang Tien-Ju
Yu Stella X.
Zhang Xiao
Publication date: 30/10/2019

Almost all existing deep learning approaches for semantic segmentation tackle this task as a pixel-wise classification problem. Yet humans understand a scene not in terms of pixels, but by decomposing it into perceptual groups and structures that are the basic building blocks of recognition. This motivates us to propose an end-to-end pixel-wise metric learning approach that mimics this process. In our approach, the optimal visual representation determines the right segmentation within individual images and associates segments with the same semantic classes across images. The core visual learning problem is therefore to maximize the similarity within segments and minimize the similarity between segments. Given a model trained this way, inference is performed consistently by extracting pixel-wise embeddings and clustering, with the semantic label determined by the majority vote of its nearest neighbors from an annotated set. As a result, we present SegSort, a first attempt at using deep learning for unsupervised semantic segmentation, achieving 76% of the performance of its supervised counterpart. When supervision is available, SegSort shows consistent improvements over conventional approaches based on pixel-wise softmax training. Additionally, our approach produces more precise boundaries and consistent region predictions. The proposed SegSort further produces an interpretable result, as each choice of label can be easily understood from the retrieved nearest segments. Comment: In ICCV 2019. Webpage & Code: https://jyhjinghwang.github.io/projects/segsort.htm
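The labeling step described in the abstract (majority vote among nearest neighbors from an annotated set) can be sketched roughly as follows. This is not the authors' code: the use of cosine similarity, the parameter `k`, the function names, and the toy embeddings are all assumptions made for illustration, and embeddings are assumed to be non-zero vectors.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two non-zero vectors (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def label_segment(query, annotated, k=3):
    """Label one segment embedding by majority vote of its k nearest neighbors.

    annotated: list of (embedding, label) pairs from a labeled set.
    """
    ranked = sorted(
        annotated,
        key=lambda item: cosine_similarity(query, item[0]),
        reverse=True,  # most similar first
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy annotated set with hypothetical 2-D segment embeddings.
annotated = [
    ([1.0, 0.0], "sky"),
    ([0.9, 0.1], "sky"),
    ([0.0, 1.0], "grass"),
    ([0.1, 0.9], "grass"),
]
print(label_segment([1.0, 0.05], annotated, k=3))
```

Because the vote is taken over retrieved neighbors, the neighbors themselves can be shown alongside the predicted label, which matches the interpretability point made in the abstract.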

arXiv.org e-Print Archive

Crossref