Relation Networks for Object Detection
Although it has long been believed that modeling relations between objects
would help object recognition, there has been little evidence that the idea
works in the deep learning era. All state-of-the-art object detection systems
still rely on recognizing object instances individually, without exploiting
their relations during learning.
This work proposes an object relation module. It processes a set of objects
simultaneously through interaction between their appearance feature and
geometry, thus allowing modeling of their relations. It is lightweight and
in-place. It does not require additional supervision and is easy to embed in
existing networks. It is shown to be effective at improving the object
recognition and duplicate removal steps in the modern object detection
pipeline, verifying the efficacy of modeling object relations in CNN-based
detection. It gives rise to the first fully end-to-end object detector.
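The interaction between appearance features and geometry described above can be illustrated with a minimal attention computation. This is a simplified sketch, not the published module: the actual relation module also uses learned linear projections for queries, keys, values, and geometric weights, all omitted here, and the function name and residual combination are illustrative assumptions.

```python
import numpy as np

def relation_module(appearance, geometry_bias):
    """Toy sketch of an object relation module: every object's feature is
    augmented with an attention-weighted sum over all objects' features,
    where the attention combines appearance similarity with a geometric
    log-bias. (Learned projections from the paper are omitted here.)"""
    n, d = appearance.shape
    scores = appearance @ appearance.T / np.sqrt(d)  # appearance similarity
    scores = scores + geometry_bias                  # geometric relation term
    w = np.exp(scores - scores.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)                # softmax over objects
    relation = w @ appearance                        # aggregated relation feature
    return appearance + relation                     # residual ("in-place") addition

rng = np.random.default_rng(0)
feats = rng.normal(size=(5, 8))   # 5 detected objects, 8-dim appearance features
geo = rng.normal(size=(5, 5))     # pairwise geometric log-weights (illustrative)
augmented = relation_module(feats, geo)
```

Because the module maps a set of N object features to N features of the same shape, it can be inserted between existing layers without changing the surrounding architecture, which is what makes it "lightweight and in-place".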
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
State-of-the-art object detection networks depend on region proposal
algorithms to hypothesize object locations. Advances like SPPnet and Fast R-CNN
have reduced the running time of these detection networks, exposing region
proposal computation as a bottleneck. In this work, we introduce a Region
Proposal Network (RPN) that shares full-image convolutional features with the
detection network, thus enabling nearly cost-free region proposals. An RPN is a
fully convolutional network that simultaneously predicts object bounds and
objectness scores at each position. The RPN is trained end-to-end to generate
high-quality region proposals, which are used by Fast R-CNN for detection. We
further merge RPN and Fast R-CNN into a single network by sharing their
convolutional features---using the recently popular terminology of neural
networks with 'attention' mechanisms, the RPN component tells the unified
network where to look. For the very deep VGG-16 model, our detection system has
a frame rate of 5fps (including all steps) on a GPU, while achieving
state-of-the-art object detection accuracy on PASCAL VOC 2007, 2012, and MS
COCO datasets with only 300 proposals per image. In ILSVRC and COCO 2015
competitions, Faster R-CNN and RPN are the foundations of the 1st-place winning
entries in several tracks. Code has been made publicly available.
Comment: Extended tech report.
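The RPN predicts object bounds and objectness scores "at each position" relative to a set of reference anchors. The sketch below reconstructs that anchor layout; the scales, aspect ratios, and stride match the defaults described in the paper, but the function itself is an illustrative reconstruction, not the authors' code.

```python
import numpy as np

def generate_anchors(feat_h, feat_w, stride=16,
                     scales=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """Sketch of RPN-style anchor generation: at every feature-map cell,
    place k = len(scales) * len(ratios) reference boxes centred on the
    corresponding image location. The RPN then regresses box offsets and
    predicts an objectness score for each anchor."""
    anchors = []
    for y in range(feat_h):
        for x in range(feat_w):
            cx = x * stride + stride / 2.0   # anchor centre in image coordinates
            cy = y * stride + stride / 2.0
            for s in scales:
                for r in ratios:
                    w = s * np.sqrt(r)       # width/height with area ~ s^2
                    h = s / np.sqrt(r)
                    anchors.append([cx - w / 2, cy - h / 2,
                                    cx + w / 2, cy + h / 2])
    return np.array(anchors)                 # (feat_h * feat_w * k, 4), (x1, y1, x2, y2)

a = generate_anchors(2, 3)                   # tiny 2x3 feature map -> 54 anchors
```

Sharing the convolutional trunk means these per-anchor predictions cost only a small extra head on top of features the detector already computes, which is why the proposals are "nearly cost-free".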
An Empirical Evaluation of Current Convolutional Architectures' Ability to Manage Nuisance Location and Scale Variability
We conduct an empirical study to test the ability of Convolutional Neural
Networks (CNNs) to reduce the effects of nuisance transformations of the input
data, such as location, scale and aspect ratio. We isolate factors by adopting
a common convolutional architecture either deployed globally on the image to
compute class posterior distributions, or restricted locally to compute class
conditional distributions given location, scale and aspect ratios of bounding
boxes determined by proposal heuristics. In theory, averaging the latter should
yield inferior performance compared to proper marginalization. Yet empirical
evidence suggests the converse, leading us to conclude that - at the current
level of complexity of convolutional architectures and scale of the data sets
used to train them - CNNs are not very effective at marginalizing nuisance
variability. We also quantify the effects of context on the overall
classification task and its impact on the performance of CNNs, and propose
improved sampling techniques for heuristic proposal schemes that improve
end-to-end performance to state-of-the-art levels. We test our hypothesis on a
classification task using the ImageNet Challenge benchmark and on a
wide-baseline matching task using the Oxford and Fischer's datasets.
Comment: 10 pages, 5 figures, 3 tables -- CVPR 2016, camera-ready version.
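The contrast drawn above between proper marginalization and what the local pipeline actually does can be made concrete: marginalization would weight each proposal's class-conditional posterior by a prior over boxes, whereas the baseline studied here averages them uniformly. The function name and logits below are invented purely for illustration.

```python
import numpy as np

def average_proposal_posteriors(proposal_logits):
    """Sketch of the 'averaging' scheme discussed above: class posteriors are
    computed conditionally on each proposal box, then combined by a uniform
    average rather than weighted by a proposal prior (which proper
    marginalization over nuisance location/scale would require)."""
    e = np.exp(proposal_logits - proposal_logits.max(axis=1, keepdims=True))
    per_proposal = e / e.sum(axis=1, keepdims=True)  # softmax per proposal box
    return per_proposal.mean(axis=0)                 # uniform average over proposals

# two hypothetical proposals over three classes
logits = np.array([[2.0, 0.5, -1.0],
                   [1.0, 1.0,  0.0]])
p = average_proposal_posteriors(logits)
```

In theory this uniform average discards the information a well-calibrated proposal prior would carry; the paper's empirical finding is that it nevertheless outperforms the global, implicitly marginalizing architecture.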
Planet Hunters X: Searching for Nearby Neighbors of 75 Planet and Eclipsing Binary Candidates from the K2 Kepler Extended Mission
We present high-resolution observations of a sample of 75 K2 targets from
Campaigns 1-3 using speckle interferometry on the Southern Astrophysical
Research (SOAR) telescope and adaptive optics (AO) imaging at the Keck II
telescope. The median SOAR -band and Keck -band detection limits at 1"
were ~mag and ~mag, respectively. This
sample includes 37 stars likely to host planets, 32 targets likely to be
eclipsing binaries (EBs), and 6 other targets previously labeled as likely
planetary false positives. We find nine likely physically bound companion stars
within 3" of three candidate transiting exoplanet host stars and six likely
EBs. Six of the nine detected companions are new discoveries; one of the six,
EPIC 206061524, is associated with a planet candidate. Among the EB candidates,
companions were only found near the shortest period ones ( days), which is
in line with previous results showing high multiplicity near short-period
binary stars. This high-resolution data, including both the detected companions
and the limits on potential unseen companions, will be useful in future planet
vetting and stellar multiplicity rate studies for planets and binaries.
Comment: Accepted in A
Recurrent Pixel Embedding for Instance Grouping
We introduce a differentiable, end-to-end trainable framework for solving
pixel-level grouping problems such as instance segmentation consisting of two
novel components. First, we regress pixels into a hyper-spherical embedding
space so that pixels from the same group have high cosine similarity while
those from different groups have similarity below a specified margin. We
analyze the choice of embedding dimension and margin, relating them to
theoretical results on the problem of distributing points uniformly on the
sphere. Second, to group instances, we utilize a variant of mean-shift
clustering, implemented as a recurrent neural network parameterized by kernel
bandwidth. This recurrent grouping module is differentiable, has convergent
dynamics, and admits a probabilistic interpretation. Backpropagating the
group-weighted loss through this module focuses learning on correcting only
those embedding errors that would not be resolved by the subsequent
clustering. Our framework, while conceptually simple and theoretically
grounded, is also practically effective and computationally efficient. We
demonstrate substantial improvements over state-of-the-art instance
segmentation methods for object proposal generation, and also show the
benefits of the grouping loss for classification tasks such as boundary
detection and semantic segmentation.
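The recurrent grouping step described above can be sketched in a few lines: embeddings live on the unit hypersphere, and each iteration moves every point toward a kernel-weighted mean of its cosine-similar neighbours before re-normalizing. The bandwidth, step count, and kernel form below are illustrative choices, not the paper's exact parameterization.

```python
import numpy as np

def mean_shift_on_sphere(emb, steps=10, bandwidth=0.1):
    """Toy sketch of mean-shift grouping on the unit hypersphere, the kind of
    recurrent clustering module described above. Each step replaces every
    point by a kernel-weighted mean of all points (weights grow with cosine
    similarity) and projects back onto the sphere."""
    x = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    for _ in range(steps):
        sim = x @ x.T                    # pairwise cosine similarity
        w = np.exp(sim / bandwidth)      # exponential kernel on similarity
        x = w @ x                        # kernel-weighted mean shift
        x /= np.linalg.norm(x, axis=1, keepdims=True)
    return x

# two noisy groups of unit vectors: after grouping, points within a group
# collapse onto a shared mode while the two modes stay well separated
pts = np.array([[1.0, 0.1], [1.0, -0.1], [1.0, 0.05],
                [0.1, 1.0], [-0.1, 1.0], [0.05, 1.0]])
modes = mean_shift_on_sphere(pts)
```

Because every operation here (matrix products, exponentials, normalization) is differentiable, the loop can be unrolled as a recurrent network and a loss on the resulting modes can be backpropagated into the embedding, which is the mechanism the abstract relies on.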
Attend Refine Repeat: Active Box Proposal Generation via In-Out Localization
Computing category-agnostic bounding box proposals is a core component of many
computer vision tasks and has therefore lately attracted a lot of attention.
In this work we propose a new approach to this problem, based on an active
strategy for generating box proposals: it starts from a set of seed boxes
uniformly distributed over the image, and then progressively shifts its
attention to the promising image areas where it is more likely to discover
well-localized bounding box proposals. We call our approach AttractioNet; a
core component of it is a CNN-based, category-agnostic object location
refinement module that yields accurate and robust bounding box predictions
regardless of the object category.
We extensively evaluate our AttractioNet approach on several image datasets
(COCO, PASCAL, ImageNet detection, and NYU-Depth V2), reporting
state-of-the-art results on all of them that surpass prior work in the field
by a significant margin, and providing strong empirical evidence that our
approach is capable of generalizing to unseen categories. Furthermore, we
evaluate our AttractioNet proposals in the context of the object detection
task using a VGG16-Net based detector; the resulting detection performance on
COCO significantly surpasses that of all other VGG16-Net based detectors,
while even being competitive with a heavily tuned ResNet-101 based detector.
Code, as well as box proposals computed for several datasets, is available at:
https://github.com/gidariss/AttractioNet
Comment: Technical report.
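The attend-refine-repeat loop described above can be sketched abstractly. Everything in this sketch is an assumption for illustration: `refine_fn` and `score_fn` stand in for the paper's CNN-based in-out localization and objectness modules, and the toy dynamics simply move each box halfway toward a single target object.

```python
import numpy as np

def attend_refine_repeat(seed_boxes, refine_fn, score_fn, iters=3, top_k=5):
    """Sketch of the active proposal strategy: start from seed boxes,
    repeatedly refine each box with a category-agnostic localization step,
    and keep attention on the highest-scoring (most promising) boxes."""
    boxes = seed_boxes
    for _ in range(iters):
        boxes = np.array([refine_fn(b) for b in boxes])  # localization step
        scores = np.array([score_fn(b) for b in boxes])  # objectness per box
        keep = np.argsort(scores)[::-1][:top_k]          # attend to best areas
        boxes = boxes[keep]
    return boxes

# toy setup: boxes should converge toward the "object" at (40, 40, 60, 60)
target = np.array([40.0, 40.0, 60.0, 60.0])
refine = lambda b: b + 0.5 * (target - b)        # move halfway toward target
score = lambda b: -np.abs(b - target).sum()      # closer boxes score higher
seeds = np.array([[0.0, 0.0, 20.0, 20.0],
                  [80.0, 80.0, 100.0, 100.0]])   # uniformly placed seed boxes
final = attend_refine_repeat(seeds, refine, score, iters=4, top_k=2)
```

The key design choice the abstract highlights is that the refinement step is category-agnostic, so the same loop can localize objects of categories never seen during training.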