1,431 research outputs found

Evaluation of CNN-based Single-Image Depth Estimation Methods

Author: A Saxena
Arno Knapitsch
F Liu
N Silberman
P Dollár
R Garg
S Kim
Publication date: 01/01/2018

While an increasing interest in deep models for single-image depth estimation methods can be observed, established schemes for their evaluation are still limited. We propose a set of novel quality criteria, allowing for a more detailed analysis by focusing on specific characteristics of depth maps. In particular, we address the preservation of edges and planar regions, depth consistency, and absolute distance accuracy. In order to employ these metrics to evaluate and compare state-of-the-art single-image depth estimation approaches, we provide a new high-quality RGB-D dataset. We used a DSLR camera together with a laser scanner to acquire high-resolution images and highly accurate depth maps. Experimental results show the validity of our proposed evaluation protocol.

arXiv.org e-Print Archive

Institute of Transport Research:Publications

Crossref

Depth Estimation via Affinity Learned with Convolutional Spatial Propagation Network

Author: C Dong
FR Kschischang
G Riegler
J Yang
N Silberman
Olaf Ronneberger
R Liu
T Igarashi
T-W Hui
X Song
Publication date: 31/07/2018

Depth estimation from a single image is a fundamental problem in computer vision. In this paper, we propose a simple yet effective convolutional spatial propagation network (CSPN) to learn the affinity matrix for depth prediction. Specifically, we adopt an efficient linear propagation model, where the propagation is performed as a recurrent convolutional operation, and the affinity among neighboring pixels is learned through a deep convolutional neural network (CNN). We apply the designed CSPN to two depth estimation tasks given a single image: (1) to refine the depth output of existing state-of-the-art (SOTA) methods; and (2) to convert sparse depth samples to a dense depth map by embedding the depth samples within the propagation procedure. The second task is inspired by the availability of LiDARs that provide sparse but accurate depth measurements. We evaluate the proposed CSPN on two popular benchmarks for depth estimation, NYU v2 and KITTI, where we show that our approach improves on prior SOTA methods in both quality (e.g., 30% more reduction in depth error) and speed (e.g., 2 to 5 times faster). Comment: 14 pages, 8 figures, ECCV 201
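The linear propagation idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `propagate`, the 4-neighbor grid, the hand-written `affinity` callback (a stand-in for the CNN-predicted affinities), and the rule of re-imposing sparse samples after every step are all assumptions made for illustration.

```python
def propagate(depth, affinity, sparse=None, steps=10):
    """Run `steps` rounds of linear propagation on a 2D depth grid.

    depth:    list of rows of floats (initial depth estimate)
    affinity: hypothetical callback (r, c, nr, nc) -> raw weight between a
              pixel and one of its neighbors; a learned network would supply
              these values in the setting the abstract describes
    sparse:   optional dict {(r, c): depth} of trusted measurements that are
              re-imposed after every step (assumed embedding rule)
    """
    rows, cols = len(depth), len(depth[0])
    sparse = sparse or {}
    current = [row[:] for row in depth]
    for (r, c), value in sparse.items():
        current[r][c] = value

    for _ in range(steps):
        nxt = [[0.0] * cols for _ in range(rows)]
        for r in range(rows):
            for c in range(cols):
                nbrs = [(r + dr, c + dc)
                        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1))
                        if 0 <= r + dr < rows and 0 <= c + dc < cols]
                raw = [affinity(r, c, nr, nc) for nr, nc in nbrs]
                # Normalize so the absolute neighbor weights sum to 1.
                norm = sum(abs(w) for w in raw) or 1.0
                weights = [w / norm for w in raw]
                # The center weight takes whatever mass the neighbors leave.
                center = 1.0 - sum(weights)
                nxt[r][c] = center * current[r][c] + sum(
                    w * current[nr][nc] for w, (nr, nc) in zip(weights, nbrs))
        # Keep the sparse measurements fixed.
        for (r, c), value in sparse.items():
            nxt[r][c] = value
        current = nxt
    return current


# Usage: a 1x5 strip with trusted depths at both ends and uniform affinity.
filled = propagate([[0.0] * 5], lambda r, c, nr, nc: 1.0,
                   sparse={(0, 0): 1.0, (0, 4): 5.0}, steps=100)
```

With uniform affinities the update reduces to neighbor averaging, so the strip settles to a linear ramp between the two fixed samples; non-uniform affinities would let the propagation respect edges instead.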

arXiv.org e-Print Archive

Crossref

Assessing the Quality of Actions

Author: B. Yao
J.C. Niebles
M. Jug
N. Silberman
R. Datta
R. Poppe
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014

While recent advances in computer vision have provided reliable methods to recognize actions in both images and videos, the problem of assessing how well people perform actions has been largely unexplored in computer vision. Since methods for assessing action quality have many real-world applications in healthcare, sports, and video retrieval, we believe the computer vision community should begin to tackle this challenging problem. To spur progress, we introduce a learning-based framework that takes steps towards assessing how well people perform actions in videos. Our approach works by training a regression model from spatiotemporal pose features to scores obtained from expert judges. Moreover, our approach can provide interpretable feedback on how people can improve their action. We evaluate our method on a new Olympic sports dataset, and our experiments suggest our framework is able to rank the athletes more accurately than a non-expert human. While promising, our method is still a long way from rivaling the performance of expert judges, indicating that there is significant opportunity in computer vision research to improve on this difficult yet important task. National Science Foundation (U.S.). Graduate Research Fellowship; Google (Firm) (Research Award); United States. Office of Naval Research. Multidisciplinary University Research Initiative (N000141010933)

DSpace@MIT

Crossref

Learning Shape Priors for Single-View 3D Completion and Reconstruction

Author: BK Horn
CB Choy
J Johnson
JT Barron
Jun-Yan Zhu
M Kazhdan
M Sung
Maxim Tatarchenko
Nathan Silberman
NJ Mitra
Q Huang
R Girdhar
R Zhang
S Bell
Y Li
Yu Xiang
Publication date: 13/09/2018

The problem of single-view 3D shape completion or reconstruction is challenging, because among the many possible shapes that explain an observation, most are implausible and do not correspond to natural objects. Recent research in the field has tackled this problem by exploiting the expressiveness of deep convolutional networks. In fact, there is another level of ambiguity that is often overlooked: among plausible shapes, there are still multiple shapes that fit the 2D image equally well; i.e., the ground truth shape is non-deterministic given a single-view input. Existing fully supervised approaches fail to address this issue, and often produce blurry mean shapes with smooth surfaces but no fine details. In this paper, we propose ShapeHD, pushing the limit of single-view shape completion and reconstruction by integrating deep generative models with adversarially learned shape priors. The learned priors serve as a regularizer, penalizing the model only if its output is unrealistic, not if it deviates from the ground truth. Our design thus overcomes both of the aforementioned levels of ambiguity. Experiments demonstrate that ShapeHD outperforms the state of the art by a large margin in both shape completion and shape reconstruction on multiple real datasets. Comment: ECCV 2018. The first two authors contributed equally to this work. Project page: http://shapehd.csail.mit.edu

arXiv.org e-Print Archive

Crossref

DSpace@MIT

Ariel - Volume 2 Number 6

Author: Ager Steven Allen
Case Jr., Delvyn C.
Dickman Shep
Dooley James R.
Eccleston Donald L.
Farnhoff Paul
Flynn Stephen P.
Miller Eugenia
Porter Lynne
Redka James
Silberman Carl
Williams Tom
Publication venue: 'Thomas Jefferson University'
Publication date: 01/03/1970

Editors: Richard J. Bonanno, Robin A. Edwards. Associate Editors: Steven Ager, Stephen Flynn, Shep Dickman, Tom Williams. Lay-out Editor: Eugenia Miller. Contributing Editors: Michael J. Blecker, W. Cherry Light, James J. Nocon, Lynne Porter. Editors Emeritus: Delvyn C. Case, Jr., Paul M. Fernhof

Jefferson Digital Commons

Estimating Depth from RGB and Sparse Sensing

Author: A Teichman
Anat Levin
DG Lowe
FH Sinz
K Khoshelham
N Silberman
R Garg
R Horaud
T Whelan
T-W Hui
V Badrinarayanan
X Song
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 09/12/2018

We present a deep model that can accurately produce dense depth maps given an RGB image with known depth at a very sparse set of pixels. The model works simultaneously for both indoor/outdoor scenes and produces state-of-the-art dense depth maps at nearly real-time speeds on both the NYUv2 and KITTI datasets. We surpass the state-of-the-art for monocular depth estimation even with depth values for only 1 out of every ~10000 image pixels, and we outperform other sparse-to-dense depth methods at all sparsity levels. With depth values for 1/256 of the image pixels, we achieve a mean absolute error of less than 1% of actual depth on indoor scenes, comparable to the performance of consumer-grade depth sensor hardware. Our experiments demonstrate that it would indeed be possible to efficiently transform sparse depth measurements obtained using, e.g., lower-power depth sensors or SLAM systems into high-quality dense depth maps. Comment: European Conference on Computer Vision (ECCV) 2018. Updated to camera-ready version with additional experiment
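A figure such as "mean absolute error of less than 1% of actual depth" reads as a relative error averaged over pixels. The sketch below shows one common way to compute such a quantity; the exact metric definition used in the paper is an assumption here, and the function name `mean_abs_rel_error` and the sample values are hypothetical.

```python
def mean_abs_rel_error(pred, truth):
    """Mean of |pred - truth| / truth over pixels with valid (positive) truth.

    pred, truth: flat sequences of depth values in the same units.
    Pixels whose ground-truth depth is not positive are skipped, which is an
    assumed convention for missing measurements.
    """
    pairs = [(p, t) for p, t in zip(pred, truth) if t > 0]
    if not pairs:
        raise ValueError("no pixels with valid ground-truth depth")
    return sum(abs(p - t) / t for p, t in pairs) / len(pairs)


# Usage with made-up depths in meters; the 0.0 entry marks a missing value.
error = mean_abs_rel_error([1.01, 2.0, 3.97, 9.0], [1.0, 2.0, 4.0, 0.0])
below_one_percent = error < 0.01
```

Dividing by the true depth makes the score comparable across near and far surfaces, which is why a single percentage can summarize a whole scene.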

arXiv.org e-Print Archive

Crossref

Generic 3D Representation via Pose Estimation and Matching

Author: B Caprile
B Li
C Xu
D Tell
DG Lowe
EJ Gibson
H Bay
J Matas
J Weston
JM Morel
K Köser
K Mikolajczyk
Karen Simonyan
L Smith
L Van der Maaten
M Brown
MJ Tarr
N Silberman
Nancy Rader
P Denis
P Moreels
R Hartley
R Held
R Kümmerle
S Agarwal
Z Zhang
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 23/10/2017

Though a large body of computer vision research has investigated developing generic semantic representations, efforts towards developing a similar representation for 3D have been limited. In this paper, we learn a generic 3D representation through solving a set of foundational proxy 3D tasks: object-centric camera pose estimation and wide baseline feature matching. Our method is based upon the premise that by providing supervision over a set of carefully selected foundational tasks, generalization to novel tasks and abstraction capabilities can be achieved. We empirically show that the internal representation of a multi-task ConvNet trained to solve the above core problems generalizes to novel 3D tasks (e.g., scene layout estimation, object pose estimation, surface normal estimation) without the need for fine-tuning and shows traits of abstraction abilities (e.g., cross-modality pose estimation). In the context of the core supervised tasks, we demonstrate our representation achieves state-of-the-art wide baseline feature matching results without requiring a priori rectification (unlike SIFT and the majority of learned features). We also show 6DOF camera pose estimation given a pair of local image patches. The accuracy of both supervised tasks is comparable to that of humans. Finally, we contribute a large-scale dataset composed of object-centric street view scenes along with point correspondences and camera pose information, and conclude with a discussion on the learned representation and open research questions. Comment: Published in ECCV16. See the project website http://3drepresentation.stanford.edu/ and dataset website https://github.com/amir32002/3D_Street_Vie

arXiv.org e-Print Archive

Crossref

Accurate and linear time pose estimation from points and lines

Author: A Ansar
A Penate-Sanchez
CP Lu
D DeMenthon
DG Lowe
E Rosten
H Bay
L Kneip
L Quan
L Zhang
M Dhome
MA Fischler
N Silberman
PD Fiore
R Hartley
S Li
V Lepetit
XS Gao
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016

The final publication is available at link.springer.com. The Perspective-n-Point (PnP) problem seeks to estimate the pose of a calibrated camera from n 3D-to-2D point correspondences. There are situations, though, where PnP solutions are prone to fail because feature point correspondences cannot be reliably estimated (e.g. scenes with repetitive patterns or with low texture). In such scenarios, one can still exploit alternative geometric entities, such as lines, yielding the so-called Perspective-n-Line (PnL) algorithms. Unfortunately, existing PnL solutions are not as accurate and efficient as their point-based counterparts. In this paper we propose a novel approach to introduce 3D-to-2D line correspondences into a PnP formulation, allowing points and lines to be processed simultaneously. For this purpose we introduce an algebraic line error that can be formulated as linear constraints on the line endpoints, even when these are not directly observable. These constraints can then be naturally integrated within the linear formulations of two state-of-the-art point-based algorithms, the OPnP and the EPnP, allowing them to handle points, lines, or a combination of them interchangeably. Exhaustive experiments show that the proposed formulation brings a remarkable boost in performance compared to only point or only line based solutions, with a negligible computational overhead compared to the original OPnP and EPnP. Peer Reviewed. Postprint (author's final draft)

Crossref

UPCommons. Portal del coneixement obert de la UPC

UPCommons (Universitat Politècnica de Catalunya)

Mass equidistribution of Hilbert modular eigenforms

Author: A. Hildebrand
A. Ichino
A. Venkatesh
A. Weil
A.I. Šnirel’man
D. Blasius
E. Kowalski
E. Lindenstrauss
E.C. Titchmarsh
E.T. Whittaker
G. Greaves
G. Shimura
H. Davenport
H. Iwaniec
H. Jacquet
H.L. Montgomery
I.S. Gradshteyn
J. Hoffstein
J.-P. Labesse
J.G. Hinz
K. Soundararajan
L. Silberman
M. Harris
M. Nair
P. Sarnak
Paul D. Nelson
R. Holowinsky
S. Gelbart
S. Zelditch
Uwe Krause
V. Blomer
W. Luo
W. Schaal
Y. Colin de Verdière
Z. Rudnick
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 28/02/2012

Let F be a totally real number field, and let f traverse a sequence of non-dihedral holomorphic eigencuspforms on GL(2)/F of weight (k_1,...,k_n), trivial central character and full level. We show that the mass of f equidistributes on the Hilbert modular variety as max(k_1,...,k_n) tends to infinity. Our result answers affirmatively a natural analogue of a conjecture of Rudnick and Sarnak (1994). Our proof generalizes the argument of Holowinsky-Soundararajan (2008) who established the case F = Q. The essential difficulty in doing so is to adapt Holowinsky's bounds for the Weyl periods of the equidistribution problem in terms of manageable shifted convolution sums of Fourier coefficients to the case of a number field with nontrivial unit group. Comment: 40 pages; typos corrected, nearly accepted for

arXiv.org e-Print Archive

Crossref

International Governance of Autonomous Military Robots

Author: Allenby Braden
Arkin Ronald
Barrett Edward T.
Borenstein Jason
Gaudet Lyn M.
Kittrie Orde
Lin Patrick
Lucas George R.
Marchant Gary E.
O'Meara Richard
Silberman Jared
Publication date: 01/01/2011

New technologies have always been a critical component of military strategy and preparedness. One new technology on the not-too-distant technological horizon is lethal autonomous robotics, which would consist of robotic weapons capable of exerting lethal force without human control or intervention. There are a number of operational and tactical factors that create incentives for the development of such lethal systems as the next step in the current development, deployment and use of autonomous systems in military forces. Yet, such robotic systems would raise a number of potential operational, policy, ethical and legal issues. This article summarizes the current status and incentives for the development of lethal autonomous robots, discusses some of the issues that would be raised by such systems, and calls for a national and international dialogue on appropriate governance of such systems before they are deployed. The article reviews potential modes of governance, ranging from ethical principles implemented through modifications or refinements of national policies, to changes in the law of war and rules of engagement, to international treaties or agreements, or to a variety of other "soft law" governance mechanisms.

GT Digital Repository (Georgia Tech)

Columbia University Academic Commons