36,503 research outputs found
ICNet for Real-Time Semantic Segmentation on High-Resolution Images
We focus on the challenging task of real-time semantic segmentation in this
paper. It finds many practical applications and yet is with fundamental
difficulty of reducing a large portion of computation for pixel-wise label
inference. We propose an image cascade network (ICNet) that incorporates
multi-resolution branches under proper label guidance to address this
challenge. We provide in-depth analysis of our framework and introduce the
cascade feature fusion unit to quickly achieve high-quality segmentation. Our
system yields real-time inference on a single GPU card with decent quality
results evaluated on challenging datasets like Cityscapes, CamVid and
COCO-Stuff.Comment: ECCV 201
Multi-Path Region-Based Convolutional Neural Network for Accurate Detection of Unconstrained "Hard Faces"
Large-scale variations still pose a challenge in unconstrained face
detection. To the best of our knowledge, no current face detection algorithm
can detect a face as large as 800 x 800 pixels while simultaneously detecting
another one as small as 8 x 8 pixels within a single image with equally high
accuracy. We propose a two-stage cascaded face detection framework, Multi-Path
Region-based Convolutional Neural Network (MP-RCNN), that seamlessly combines a
deep neural network with a classic learning strategy, to tackle this challenge.
The first stage is a Multi-Path Region Proposal Network (MP-RPN) that proposes
faces at three different scales. It simultaneously utilizes three parallel
outputs of the convolutional feature maps to predict multi-scale candidate face
regions. The "atrous" convolution trick (convolution with up-sampled filters)
and a newly proposed sampling layer for "hard" examples are embedded in MP-RPN
to further boost its performance. The second stage is a Boosted Forests
classifier, which utilizes deep facial features pooled from inside the
candidate face regions as well as deep contextual features pooled from a larger
region surrounding the candidate face regions. This step is included to further
remove hard negative samples. Experiments show that this approach achieves
state-of-the-art face detection performance on the WIDER FACE dataset "hard"
partition, outperforming the former best result by 9.6% for the Average
Precision.Comment: 11 pages, 7 figures, to be presented at CRV 201
Ithemal: Accurate, Portable and Fast Basic Block Throughput Estimation using Deep Neural Networks
Predicting the number of clock cycles a processor takes to execute a block of
assembly instructions in steady state (the throughput) is important for both
compiler designers and performance engineers. Building an analytical model to
do so is especially complicated in modern x86-64 Complex Instruction Set
Computer (CISC) machines with sophisticated processor microarchitectures in
that it is tedious, error prone, and must be performed from scratch for each
processor generation. In this paper we present Ithemal, the first tool which
learns to predict the throughput of a set of instructions. Ithemal uses a
hierarchical LSTM--based approach to predict throughput based on the opcodes
and operands of instructions in a basic block. We show that Ithemal is more
accurate than state-of-the-art hand-written tools currently used in compiler
backends and static machine code analyzers. In particular, our model has less
than half the error of state-of-the-art analytical models (LLVM's llvm-mca and
Intel's IACA). Ithemal is also able to predict these throughput values just as
fast as the aforementioned tools, and is easily ported across a variety of
processor microarchitectures with minimal developer effort.Comment: Published at 36th International Conference on Machine Learning (ICML)
201
- …