Depth Estimation via Affinity Learned with Convolutional Spatial Propagation Network
Depth estimation from a single image is a fundamental problem in computer
vision. In this paper, we propose a simple yet effective convolutional spatial
propagation network (CSPN) to learn the affinity matrix for depth prediction.
Specifically, we adopt an efficient linear propagation model, where the
propagation is performed in the manner of a recurrent convolutional operation,
and the affinity among neighboring pixels is learned through a deep
convolutional neural network (CNN). We apply the designed CSPN to two depth
estimation tasks given a single image: (1) To refine the depth output from
state-of-the-art (SOTA) existing methods; and (2) to convert sparse depth
samples to a dense depth map by embedding the depth samples within the
propagation procedure. The second task is inspired by the availability of
LIDARs that provide sparse but accurate depth measurements. We evaluated
the proposed CSPN on two popular benchmarks for depth estimation, i.e. NYU v2
and KITTI, where we show that our proposed approach improves on prior SOTA
methods not only in quality (e.g., 30% more reduction in depth error) but also
in speed (e.g., 2 to 5 times faster).
Comment: 14 pages, 8 figures, ECCV 201
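To make the propagation idea in this abstract concrete, here is a minimal sketch of a linear spatial propagation loop over a depth map, with optional re-injection of sparse depth samples after every step. It is an illustration under stated assumptions, not the paper's implementation: the function names, the 3x3 neighborhood, the absolute-sum normalization, and the hand-supplied `affinity` values are all hypothetical (in the paper the affinities are predicted by a CNN).

```python
# Hypothetical sketch of linear spatial propagation on a depth map.
# Assumptions (not from the abstract): 3x3 neighborhood, weights normalized
# by the sum of absolute values, center weight chosen so all weights sum to 1.

OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]


def propagate_step(depth, affinity):
    """One propagation step: each pixel becomes a weighted mix of itself
    and its in-bounds 8-neighbors. affinity[i][j] holds 8 raw weights."""
    h, w = len(depth), len(depth[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            raw = affinity[i][j]
            norm = sum(abs(a) for a in raw) or 1.0  # avoid division by zero
            acc, used = 0.0, 0.0
            for (di, dj), a in zip(OFFSETS, raw):
                ni, nj = i + di, j + dj
                if 0 <= ni < h and 0 <= nj < w:
                    k = a / norm
                    acc += k * depth[ni][nj]
                    used += k
            # The center keeps whatever weight the neighbors did not use.
            out[i][j] = (1.0 - used) * depth[i][j] + acc
    return out


def refine(depth, affinity, sparse=None, steps=3):
    """Run several propagation steps. If `sparse` maps (row, col) -> depth,
    those measurements are written back after every step so they are kept."""
    cur = [row[:] for row in depth]
    for _ in range(steps):
        cur = propagate_step(cur, affinity)
        if sparse is not None:
            for (i, j), d in sparse.items():
                cur[i][j] = d
    return cur
```

Because the weights at each pixel sum to one, a constant depth map is left unchanged, while a sparse sample that is re-injected after each step spreads to its neighbors over successive iterations.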
Monotone discretizations of levelset convex geometric PDEs
We introduce a novel algorithm that converges to level-set convex viscosity
solutions of high-dimensional Hamilton-Jacobi equations. The algorithm is
applicable to a broad class of curvature motion PDEs, as well as a recently
developed Hamilton-Jacobi equation for the Tukey depth, which is a statistical
depth measure of data points. A main contribution of our work is a new monotone
scheme for approximating the direction of the gradient, which allows for
monotone discretizations of pure partial derivatives in the direction of, and
orthogonal to, the gradient. We provide a convergence analysis of the algorithm
on both regular Cartesian grids and unstructured point clouds in any dimension
and present numerical experiments that demonstrate the effectiveness of the
algorithm in approximating solutions of the affine flow in two dimensions and
the Tukey depth measure of high-dimensional datasets such as MNIST and
FashionMNIST.
Comment: 42 pages including reference
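For readers unfamiliar with the Tukey (halfspace) depth mentioned in this abstract, the sketch below estimates it directly from its definition: the smallest fraction of data points lying in any closed halfspace whose boundary passes through the query point. This is a plain random-direction approximation, not the Hamilton-Jacobi scheme the paper proposes; the function name, the number of directions, and the seed are hypothetical choices. Since only a finite set of directions is sampled, the result is an upper bound on the true depth.

```python
import random


def tukey_depth_estimate(x, data, num_directions=500, seed=0):
    """Approximate the Tukey depth of point x with respect to `data` by
    taking the minimum halfspace fraction over randomly sampled directions."""
    rng = random.Random(seed)
    dim = len(x)
    n = len(data)
    best = 1.0
    for _ in range(num_directions):
        # A Gaussian vector gives a uniformly random direction; only its
        # sign pattern against (p - x) matters, so no normalization is needed.
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        count = sum(
            1
            for p in data
            if sum(vk * (pk - xk) for vk, pk, xk in zip(v, p, x)) >= 0.0
        )
        best = min(best, count / n)
    return best


# Usage: the center of a square is deep, a far-away point is not.
square = [(1.0, 1.0), (1.0, -1.0), (-1.0, 1.0), (-1.0, -1.0)]
center_depth = tukey_depth_estimate((0.0, 0.0), square)
outside_depth = tukey_depth_estimate((100.0, 100.0), square)
```

The cost of this approach grows with the number of directions needed to approximate the minimum well in high dimensions, which is part of the motivation for alternative formulations.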
Physics-Informed Computer Vision: A Review and Perspectives
The incorporation of physical information in machine learning frameworks is
opening and transforming many application domains. Here the learning process is
augmented through the induction of fundamental knowledge and governing physical
laws. In this work we explore their utility for computer vision tasks in
interpreting and understanding visual data. We present a systematic literature
review of formulation and approaches to computer vision tasks guided by
physical laws. We begin by decomposing the popular computer vision pipeline
into a taxonomy of stages and investigate approaches to incorporate governing
physical equations in each stage. Existing approaches in each task are analyzed
with regard to which governing physical processes are modeled and formulated,
and how they are incorporated, i.e. by modifying data (observation bias),
modifying networks (inductive bias), or modifying losses (learning bias). The
taxonomy offers a
unified view of the application of the physics-informed capability,
highlighting where physics-informed learning has been conducted and where the
gaps and opportunities are. Finally, we highlight open problems and challenges
to inform future research. While still in its early days, the study of
physics-informed computer vision has the promise to develop better computer
vision models that can improve physical plausibility, accuracy, data efficiency
and generalization in increasingly realistic applications.
AdvMIL: Adversarial Multiple Instance Learning for the Survival Analysis on Whole-Slide Images
The survival analysis on histological whole-slide images (WSIs) is one of the
most important means to estimate patient prognosis. Although many
weakly-supervised deep learning models have been developed for gigapixel WSIs,
their potential is generally restricted by classical survival analysis rules
and full-supervision requirements. As a result, these models provide patients
only with a completely certain point estimate of time-to-event, and they
can only learn from well-annotated WSI data, which currently exists only at a
small scale.
To tackle these problems, we propose a novel adversarial multiple instance
learning (AdvMIL) framework. This framework is based on adversarial
time-to-event modeling, and it integrates the multiple instance learning (MIL)
that is essential for WSI representation learning. The framework is
plug-and-play, so most existing WSI-based models with embedding-level MIL
networks can be easily upgraded by applying it, gaining an improved ability
for survival distribution estimation and semi-supervised learning. Our extensive
experiments show that AdvMIL could not only bring performance improvement to
mainstream WSI models at a relatively low computational cost, but also enable
these models to learn from unlabeled data with semi-supervised learning. Our
AdvMIL framework could promote the research of time-to-event modeling in
computational pathology with its novel paradigm of adversarial MIL.
Comment: 13 pages, 10 figures, 8 table
A Neural Network Approach for Real-Time High-Dimensional Optimal Control
We propose a neural network approach for solving high-dimensional optimal
control problems arising in real-time applications. Our approach yields
controls in a feedback form, where the policy function is given by a neural
network (NN). Specifically, we fuse the Hamilton-Jacobi-Bellman (HJB) and
Pontryagin Maximum Principle (PMP) approaches by parameterizing the value
function with an NN. We can therefore synthesize controls in real-time without
having to solve an optimization problem. Once the policy function is trained,
generating a control at a given space-time location takes milliseconds; in
contrast, efficient nonlinear programming methods typically perform the same
task in seconds. We train the NN offline using the objective function of the
control problem and penalty terms that enforce the HJB equations. Therefore,
our training algorithm does not involve data generated by another algorithm. By
training on a distribution of initial states, we ensure the controls'
optimality on a large portion of the state-space. Our grid-free approach scales
efficiently to dimensions where grids become impractical or infeasible. We
demonstrate the effectiveness of our approach on several multi-agent
collision-avoidance problems in up to 150 dimensions. Furthermore, we
empirically observe that the number of parameters in our approach scales
linearly with the dimension of the control problem, thereby mitigating the
curse of dimensionality.
Comment: 16 pages, 12 figures. This work has been submitted for possible
publication. Copyright may be transferred without notice, after which this
version may no longer be availabl
3D Scene Geometry Estimation from 360 Imagery: A Survey
This paper provides a comprehensive survey of pioneering and state-of-the-art
3D scene geometry estimation methodologies based on single, two, or multiple
images captured with omnidirectional optics. We first revisit the basic
concepts of the spherical camera model, and review the most common acquisition
technologies and representation formats suitable for omnidirectional (also
called 360, spherical or panoramic) images and videos. We then survey
monocular layout and depth inference approaches, highlighting the recent
advances in learning-based solutions suited for spherical data. The classical
stereo matching is then revisited in the spherical domain, where methodologies
for detecting and describing sparse and dense features become crucial. The
stereo matching concepts are then extended to multiple-view camera setups,
categorizing them among light fields, multi-view stereo, and structure from
motion (or visual simultaneous localization and mapping). We also compile and
discuss commonly adopted datasets and figures of merit indicated for each
purpose and list recent results for completeness. We conclude this paper by
pointing out current and future trends.
Comment: Published in ACM Computing Survey