Learning Robust Object Recognition Using Composed Scenes from Generative Models
Recurrent feedback connections in the mammalian visual system have been
hypothesized to play a role in synthesizing input in the theoretical framework
of analysis by synthesis. The comparison of internally synthesized
representation with that of the input provides a validation mechanism during
perceptual inference and learning. Inspired by these ideas, we proposed that
the synthesis machinery can compose new, unobserved images by imagination to
train the network itself so as to increase the robustness of the system in
novel scenarios. As a proof of concept, we investigated whether images composed
by imagination could help an object recognition system to deal with occlusion,
which is challenging for the current state-of-the-art deep convolutional neural
networks. We fine-tuned a network on images containing objects in various
occlusion scenarios that are imagined or self-generated through a deep
generator network. Trained on imagined occluded scenarios under the object
persistence constraint, our network discovered more subtle and localized image
features that were neglected by the original network for object classification,
obtaining better separability of different object classes in the feature space.
This leads to significant improvement of object recognition under occlusion for
our network relative to the original network trained only on un-occluded
images. In addition to providing practical benefits in object recognition under
occlusion, this work demonstrates that self-generated composition of
visual scenes through the synthesis loop, combined with the object persistence
constraint, can provide opportunities for neural networks to discover new
relevant patterns in the data, and become more flexible in dealing with novel
situations.
Comment: Accepted by the 14th Conference on Computer and Robot Vision
Deep Regionlets for Object Detection
In this paper, we propose a novel object detection framework named "Deep
Regionlets" by establishing a bridge between deep neural networks and
conventional detection schema for accurate generic object detection. Motivated
by the abilities of regionlets for modeling object deformation and multiple
aspect ratios, we incorporate regionlets into an end-to-end trainable deep
learning framework. The deep regionlets framework consists of a region
selection network and a deep regionlet learning module. Specifically, given a
detection bounding box proposal, the region selection network provides guidance
on where to select regions to learn the features from. The regionlet learning
module focuses on local feature selection and transformation to alleviate local
variations. To this end, we first realize non-rectangular region selection
within the detection framework to accommodate variations in object appearance.
Moreover, we design a "gating network" within the regionlet learning module to
enable soft regionlet selection and pooling. The Deep Regionlets framework is
trained end-to-end without additional efforts. We perform ablation studies and
conduct extensive experiments on the PASCAL VOC and Microsoft COCO datasets.
The proposed framework outperforms state-of-the-art algorithms, such as
RetinaNet and Mask R-CNN, even without additional segmentation labels.
Comment: Accepted to ECCV 201
Rectangular Layouts and Contact Graphs
Contact graphs of isothetic rectangles unify many concepts from applications
including VLSI and architectural design, computational geometry, and GIS.
Minimizing the area of their corresponding rectangular layouts is a key
problem. We study the area-optimization problem and show that it is NP-hard to
find a minimum-area rectangular layout of a given contact graph. We present
O(n)-time algorithms that construct -area rectangular layouts for
general contact graphs and -area rectangular layouts for trees.
(For trees, this is an -approximation algorithm.) We also present an
infinite family of graphs (rsp., trees) that require (rsp.,
) area.
We derive these results by presenting a new characterization of graphs that
admit rectangular layouts using the related concept of rectangular duals.
A corollary to our results relates the class of graphs that admit rectangular
layouts to rectangle of influence drawings.
Comment: 28 pages, 13 figures, 55 references, 1 appendix
The Partial Visibility Representation Extension Problem
For a graph , a function is called a bar visibility
representation of when for each vertex , is a
horizontal line segment (bar) and iff there is an
unobstructed, vertical, -wide line of sight between and
. Graphs admitting such representations are well understood (via
simple characterizations) and recognizable in linear time. For a directed graph
, a bar visibility representation of , additionally, puts the bar
strictly below the bar for each directed edge of
. We study a generalization of the recognition problem where a function
defined on a subset of is given and the question is whether
there is a bar visibility representation of with for every . We show that for undirected graphs this problem
together with closely related problems are NP-complete, but for certain cases
involving directed graphs it is solvable in polynomial time.
Comment: Appears in the Proceedings of the 24th International Symposium on
Graph Drawing and Network Visualization (GD 2016)
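The visibility condition in the abstract above can be illustrated with a small sketch. It assumes each bar is given as a hypothetical triple (y, x_left, x_right), and it treats two bars as adjacent when some stretch of positive width in their shared x-range is not covered by any bar lying strictly between them. The function names and the input layout are illustrative assumptions, not the paper's notation or algorithm.

```python
def sees(bars, u, v):
    """Return True if bars u and v share a vertical line of sight of positive width."""
    yu, lu, ru = bars[u]
    yv, lv, rv = bars[v]
    lo, hi = max(lu, lv), min(ru, rv)
    # No shared x-range of positive width, or both bars on the same level.
    if hi <= lo or yu == yv:
        return False
    y_min, y_max = sorted((yu, yv))
    # Clip every bar lying strictly between u and v to the shared x-range.
    blockers = sorted(
        (max(l, lo), min(r, hi))
        for w, (y, l, r) in bars.items()
        if w not in (u, v) and y_min < y < y_max and min(r, hi) > max(l, lo)
    )
    cursor = lo  # everything in [lo, cursor] is known to be blocked
    for l, r in blockers:
        if l > cursor:
            return True  # uncovered gap (cursor, l) has positive width
        cursor = max(cursor, r)
    return cursor < hi


def visibility_edges(bars):
    """Compute the undirected graph induced by a set of horizontal bars."""
    names = sorted(bars)
    return {
        frozenset((u, v))
        for i, u in enumerate(names)
        for v in names[i + 1:]
        if sees(bars, u, v)
    }
```

Checking a candidate representation against a given graph then amounts to comparing `visibility_edges(bars)` with the graph's edge set. The extension problem studied in the paper is harder, since only some bars are fixed in advance.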
Cascaded Segmentation-Detection Networks for Word-Level Text Spotting
We introduce an algorithm for word-level text spotting that is able to
accurately and reliably determine the bounding regions of individual words of
text "in the wild". Our system is formed by the cascade of two convolutional
neural networks. The first network is fully convolutional and is in charge of
detecting areas containing text. This results in a very reliable but possibly
inaccurate segmentation of the input image. The second network (inspired by the
popular YOLO architecture) analyzes each segment produced in the first stage,
and predicts oriented rectangular regions containing individual words. No
post-processing (e.g. text line grouping) is necessary. With execution time of
450 ms for a 1000-by-560 image on a Titan X GPU, our system achieves the
highest score to date among published algorithms on the ICDAR 2015 Incidental
Scene Text dataset benchmark.
Comment: 7 pages, 8 figures
Note on y-truncation: a simple approach to generating bounded distributions for environmental applications
It may sometimes be desirable to introduce bounds into probability distributions to formalise the presence of upper or lower physical limits to data to which the distribution has been applied. For example, an upper bound in raindrop sizes might be represented by introducing an upper bound to an exponential drop-size distribution. However, the standard method of truncating unbounded probability distributions yields distributions with non-zero probability density at the resulting bounds. In reality it is likely that physical bounding processes in nature increase in intensity as the bound is approached, causing a progressive decline in observation relative frequency to zero at the bound. Truncation below a y-axis point is proposed as a simple alternative means of creating more natural truncated probability distributions for application to data of this type. The resulting "y-truncated" distributions have similarities with the traditional truncated distributions but probability densities have the desirable feature of always declining to zero at the bounds. In addition, the y-truncation approach can also serve in its own right as a mean
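One plausible reading of truncation below a y-axis point, sketched here for the exponential example mentioned in the abstract, is to subtract a constant height from the density, keep only the part that stays positive, and renormalise. This is an assumption about the construction rather than the paper's stated formula. The names `rate` and `y_cut` are hypothetical.

```python
import math


def y_truncated_exponential(rate, y_cut):
    """Return (pdf, upper_bound) for an exponential density cut at height y_cut.

    Assumed construction: g(x) = (f(x) - y_cut) / Z on [0, upper_bound],
    where f(x) = rate * exp(-rate * x) and f(upper_bound) = y_cut.
    """
    if not 0 < y_cut < rate:
        raise ValueError("y_cut must lie strictly between 0 and rate")
    # The bound is the point where the original density falls to y_cut.
    upper_bound = math.log(rate / y_cut) / rate
    # Z = integral of (f(x) - y_cut) over [0, upper_bound].
    norm = 1.0 - y_cut / rate - y_cut * upper_bound

    def pdf(x):
        if x < 0 or x > upper_bound:
            return 0.0
        return (rate * math.exp(-rate * x) - y_cut) / norm

    return pdf, upper_bound
```

Under this construction the density reaches exactly zero at the upper bound, whereas conventional truncation at the same point would leave a density of `y_cut` divided by the retained probability mass.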
Epitaxial Frustration in Deposited Packings of Rigid Disks and Spheres
We use numerical simulation to investigate and analyze the way that rigid
disks and spheres arrange themselves when compressed next to incommensurate
substrates. For disks, a movable set is pressed into a jammed state against an
ordered fixed line of larger disks, where the diameter ratio of movable to
fixed disks is 0.8. The corresponding diameter ratio for the sphere simulations
is 0.7, where the fixed substrate has the structure of a (001) plane of a
face-centered cubic array. Results obtained for both disks and spheres exhibit
various forms of density-reducing packing frustration next to the
incommensurate substrate, including some cases displaying disorder that extends
far from the substrate. The disk system calculations strongly suggest that the
most efficient (highest density) packings involve configurations that are
periodic in the lateral direction parallel to the substrate, with substantial
geometric disruption only occurring near the substrate. Some evidence has also
emerged suggesting that for the sphere systems a corresponding structure doubly
periodic in the lateral directions would yield the highest packing density;
however all of the sphere simulations completed thus far produced some residual
"bulk" disorder not obviously resulting from substrate mismatch. In view of the
fact that the cases studied here represent only a small subset of all that
eventually deserve attention, we end with discussion of the directions in which
first extensions of the present simulations might profitably be pursued.
Comment: 28 pages, 14 figures; typos fixed; a sentence added to 4th paragraph
of sect 5 in response to a referee's comment