Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
State-of-the-art object detection networks depend on region proposal
algorithms to hypothesize object locations. Advances like SPPnet and Fast R-CNN
have reduced the running time of these detection networks, exposing region
proposal computation as a bottleneck. In this work, we introduce a Region
Proposal Network (RPN) that shares full-image convolutional features with the
detection network, thus enabling nearly cost-free region proposals. An RPN is a
fully convolutional network that simultaneously predicts object bounds and
objectness scores at each position. The RPN is trained end-to-end to generate
high-quality region proposals, which are used by Fast R-CNN for detection. We
further merge RPN and Fast R-CNN into a single network by sharing their
convolutional features: in the recently popular terminology of neural
networks with 'attention' mechanisms, the RPN component tells the unified
network where to look. For the very deep VGG-16 model, our detection system has
a frame rate of 5 fps (including all steps) on a GPU, while achieving
state-of-the-art object detection accuracy on PASCAL VOC 2007, 2012, and MS
COCO datasets with only 300 proposals per image. In ILSVRC and COCO 2015
competitions, Faster R-CNN and RPN are the foundations of the 1st-place winning
entries in several tracks. Code has been made publicly available.
Comment: Extended tech repor
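The abstract above describes an RPN as predicting object bounds and objectness scores at each position and keeping only 300 proposals per image. The following is a minimal sketch of that idea in plain Python, not the authors' implementation. The function names `make_anchors` and `top_proposals`, the stride of 16, and the three scales and three aspect ratios are illustrative assumptions that do not appear in the abstract. The sketch also omits the learned box regression and any overlap suppression a real detector would apply.

```python
from itertools import product
from math import sqrt


def make_anchors(feat_h, feat_w, stride=16,
                 scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Return (x1, y1, x2, y2) reference boxes for every feature-map position.

    stride, scales and ratios are assumed values chosen for illustration.
    """
    anchors = []
    for row, col in product(range(feat_h), range(feat_w)):
        # Centre of this feature-map cell in image coordinates.
        cx = (col + 0.5) * stride
        cy = (row + 0.5) * stride
        for scale, ratio in product(scales, ratios):
            # Keep the box area at scale**2 while varying height / width.
            w = scale / sqrt(ratio)
            h = scale * sqrt(ratio)
            anchors.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return anchors


def top_proposals(anchors, scores, n=300):
    """Keep the n boxes with the highest objectness score."""
    ranked = sorted(zip(scores, anchors), key=lambda pair: pair[0], reverse=True)
    return [box for _, box in ranked[:n]]
```

In a trained network the `scores` list would come from the objectness output computed on the shared convolutional features; here it is simply whatever list of numbers the caller supplies, one per box.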
Memory-Efficient Deep Salient Object Segmentation Networks on Gridized Superpixels
Computer vision algorithms with pixel-wise labeling tasks, such as semantic
segmentation and salient object detection, have gone through a significant
accuracy increase with the incorporation of deep learning. Deep segmentation
methods slightly modify and fine-tune pre-trained networks that have hundreds
of millions of parameters. In this work, we question the need to have such
memory-demanding networks for the specific task of salient object segmentation.
To this end, we propose a way to learn a memory-efficient network from scratch
by training it only on salient object detection datasets. Our method encodes
images to gridized superpixels that preserve both the object boundaries and the
connectivity rules of regular pixels. This representation allows us to use
convolutional neural networks that operate on regular grids. By using these
encoded images, we train a memory-efficient network using only 0.048% of the
number of parameters that other deep salient object detection networks have.
Our method achieves accuracy comparable to state-of-the-art deep salient
object detection methods and provides a faster and much more memory-efficient
alternative to them. Due to its easy deployment, such a network is preferable
for applications on memory-limited devices such as mobile phones and IoT
devices.
Comment: 6 pages, submitted to MMSP 201
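To give a feel for the 0.048% figure quoted above, the arithmetic can be sketched as follows. The helper name `reduced_param_count` and the baseline of 100 million parameters are hypothetical, chosen only to match the abstract's "hundreds of millions" order of magnitude. The abstract does not state the actual parameter counts.

```python
def reduced_param_count(baseline_params, percent=0.048):
    """Parameter count that is `percent` percent of a baseline network's count."""
    return round(baseline_params * percent / 100)


# Hypothetical baseline of 100 million parameters (an assumption, not from the source).
small = reduced_param_count(100_000_000)
```

Under that assumed baseline, the reduced network would have on the order of tens of thousands of parameters.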