2,669 research outputs found
Pushing the Boundaries of Boundary Detection using Deep Learning
In this work we show that adapting Deep Convolutional Neural Network training
to the task of boundary detection can result in substantial improvements over
the current state-of-the-art in boundary detection.
Our contributions consist firstly in combining a careful design of the loss
for boundary detection training, a multi-resolution architecture and training
with external data to improve the detection accuracy of the current state of
the art. When measured on the standard Berkeley Segmentation Dataset, we
improve theoptimal dataset scale F-measure from 0.780 to 0.808 - while human
performance is at 0.803. We further improve performance to 0.813 by combining
deep learning with grouping, integrating the Normalized Cuts technique within a
deep network.
We also examine the potential of our boundary detector in conjunction with
the task of semantic segmentation and demonstrate clear improvements over
state-of-the-art systems. Our detector is fully integrated in the popular Caffe
framework and processes a 320x420 image in less than a second.Comment: The previous version reported large improvements w.r.t. the LPO
region proposal baseline, which turned out to be due to a wrong computation
for the baseline. The improvements are currently less important, and are
omitted. We are sorry if the reported results caused any confusion. We have
also integrated reviewer feedback regarding human performance on the BSD
benchmar
Recommended from our members
The use of singular functions for the approximate conformal mapping of doubly-connected domains
Let f be the function which maps conformally a given doubly- connected domain onto a circular annulus. We consider the use of two closely related methods for determining approximations to f of the form
fn (z) = z exp, ⎪⎩⎪⎨⎧âŽâŽ¬âŽ«Î£âˆ’(z)uan1jjj
where {uj} is a set of basis functions. The two methods are respectively a variational method, based on an extremum property of the function
H(z) = f′(z)/f(z) - 1/z,
and an orthononnalization method, based on approximating the function H by a finite Fourier series sum.
The main purpose of the paper is to consider the use of the two methods for the mapping of domains having sharp corners, where corner singularities occur. We show, by means of numerical examples, that both methods are capable of producing approximations of high accuracy for the mapping of such "difficult" doubly-connected domains. The essential requirement for this is that the basis set {uj} contains singular functions that reflect the asymptotic behaviour of the function H in the neighbourhood of each "singular" corner
Recommended from our members
Two numerical methods for the conformal mapping of simply-connected domains
Structural Attention Neural Networks for improved sentiment analysis
We introduce a tree-structured attention neural network for sentences and
small phrases and apply it to the problem of sentiment classification. Our
model expands the current recursive models by incorporating structural
information around a node of a syntactic tree using both bottom-up and top-down
information propagation. Also, the model utilizes structural attention to
identify the most salient representations during the construction of the
syntactic tree. To our knowledge, the proposed models achieve state of the art
performance on the Stanford Sentiment Treebank dataset.Comment: Submitted to EACL2017 for revie
Fast, Exact and Multi-Scale Inference for Semantic Image Segmentation with Deep Gaussian CRFs
In this work we propose a structured prediction technique that combines the
virtues of Gaussian Conditional Random Fields (G-CRF) with Deep Learning: (a)
our structured prediction task has a unique global optimum that is obtained
exactly from the solution of a linear system (b) the gradients of our model
parameters are analytically computed using closed form expressions, in contrast
to the memory-demanding contemporary deep structured prediction approaches that
rely on back-propagation-through-time, (c) our pairwise terms do not have to be
simple hand-crafted expressions, as in the line of works building on the
DenseCRF, but can rather be `discovered' from data through deep architectures,
and (d) out system can trained in an end-to-end manner. Building on standard
tools from numerical analysis we develop very efficient algorithms for
inference and learning, as well as a customized technique adapted to the
semantic segmentation task. This efficiency allows us to explore more
sophisticated architectures for structured prediction in deep learning: we
introduce multi-resolution architectures to couple information across scales in
a joint optimization framework, yielding systematic improvements. We
demonstrate the utility of our approach on the challenging VOC PASCAL 2012
image segmentation benchmark, showing substantial improvements over strong
baselines. We make all of our code and experiments available at
{https://github.com/siddharthachandra/gcrf}Comment: Our code is available at https://github.com/siddharthachandra/gcr
Recommended from our members
Numerical conformal mapping of exterior domains
The work of the present paper is closely related to the two numerical procedures described in [11], for determining approximations to the function which maps conformally a bounded simply-connected domain Ω1 , with boundary ∂Ω, onto the unit disc. Here, we consider the use of these procedures for the solution of the corresponding exterior problem, i.e. the problem of determining approximations to the mapping function which maps conformally the exterior domain Ω = compl(ΩI⋃∂Ω) onto the unit disc
Mass Displacement Networks
Despite the large improvements in performance attained by using deep learning
in computer vision, one can often further improve results with some additional
post-processing that exploits the geometric nature of the underlying task. This
commonly involves displacing the posterior distribution of a CNN in a way that
makes it more appropriate for the task at hand, e.g. better aligned with local
image features, or more compact. In this work we integrate this geometric
post-processing within a deep architecture, introducing a differentiable and
probabilistically sound counterpart to the common geometric voting technique
used for evidence accumulation in vision. We refer to the resulting neural
models as Mass Displacement Networks (MDNs), and apply them to human pose
estimation in two distinct setups: (a) landmark localization, where we collapse
a distribution to a point, allowing for precise localization of body keypoints
and (b) communication across body parts, where we transfer evidence from one
part to the other, allowing for a globally consistent pose estimate. We
evaluate on large-scale pose estimation benchmarks, such as MPII Human Pose and
COCO datasets, and report systematic improvements when compared to strong
baselines.Comment: 12 pages, 4 figure
To The Point: Correspondence-driven monocular 3D category reconstruction
We present To The Point (TTP), a method for reconstructing 3D objects from a single image using 2D to 3D correspondences learned from weak supervision. We recover a 3D shape from a 2D image by first regressing the 2D positions corresponding to the 3D template vertices and then jointly estimating a rigid camera transform and non-rigid template deformation that optimally explain the 2D positions through the 3D shape projection. By relying on 3D-2D correspondences we use a simple per-sample optimization problem to replace CNN-based regression of camera pose and non-rigid deformation and thereby obtain substantially more accurate 3D reconstructions. We treat this optimization as a differentiable layer and train the whole system in an end-to-end manner. We report systematic quantitative improvements on multiple categories and provide qualitative results comprising diverse shape, pose and texture prediction examples. Project website: https://fkokkinos.github.io/to_the_point
- …