Joint Prediction of Depths, Normals and Surface Curvature from RGB Images using CNNs

Author: Dharmasiri Thanuja
Drummond Tom
Spek Andrew
Publication date: 23/06/2017

Understanding the 3D structure of a scene is of vital importance when it comes to developing fully autonomous robots. To this end, we present a novel deep-learning-based framework that estimates depth, surface normals and surface curvature using only a single RGB image. To the best of our knowledge, this is the first work to estimate surface curvature from colour using a machine learning approach. Additionally, we demonstrate that by tuning the network to infer well-designed features, such as surface curvature, we can achieve improved performance at estimating depth and normals. This indicates that network guidance is still a useful aspect of designing and training a neural network. We run extensive experiments in which the network is trained to infer different tasks while the model capacity is kept constant, resulting in different feature maps based on the tasks at hand. We outperform the previous state-of-the-art benchmarks that jointly estimate depths and surface normals, while predicting surface curvature in parallel.

arXiv.org e-Print Archive

Crossref

Unsupervised Learning of Visual Representations using Videos

Author: Gupta Abhinav
Wang Xiaolong
Publication date: 06/10/2015

Is strong supervision necessary for learning a good visual representation? Do we really need millions of semantically-labeled images to train a Convolutional Neural Network (CNN)? In this paper, we present a simple yet surprisingly powerful approach for unsupervised learning of CNNs. Specifically, we use hundreds of thousands of unlabeled videos from the web to learn visual representations. Our key idea is that visual tracking provides the supervision. That is, two patches connected by a track should have similar visual representations in deep feature space, since they probably belong to the same object or object part. We design a Siamese-triplet network with a ranking loss function to train this CNN representation. Without using a single image from ImageNet, just using 100K unlabeled videos and the VOC 2012 dataset, we train an ensemble of unsupervised networks that achieves 52% mAP (no bounding box regression). This performance comes tantalizingly close to its ImageNet-supervised counterpart, an ensemble which achieves a mAP of 54.4%. We also show that our unsupervised network can perform competitively in other tasks such as surface-normal estimation.
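The ranking loss mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the loss formula, so the hinge form, the cosine distance, the margin value of 0.5 and the function names below are all assumptions chosen for illustration. The idea shown is only that a tracked patch pair (anchor, positive) should end up closer in feature space than an unrelated patch (negative).

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def triplet_ranking_loss(anchor, positive, negative, margin=0.5):
    """Hinge-style ranking loss over one triplet of feature vectors.

    The loss is zero once the anchor is closer to the positive than to
    the negative by at least `margin` (an assumed value, not taken from
    the abstract).
    """
    d_pos = cosine_distance(anchor, positive)
    d_neg = cosine_distance(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Hypothetical 2-D features: the tracked patch matches the anchor exactly,
# the random patch is orthogonal to it.
well_ranked = triplet_ranking_loss([1.0, 0.0], [1.0, 0.0], [0.0, 1.0])
badly_ranked = triplet_ranking_loss([1.0, 0.0], [0.0, 1.0], [1.0, 0.0])
```

In a Siamese-triplet setup the three vectors would come from three weight-sharing copies of the same network; here they are plain lists so that the loss can be inspected in isolation.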

arXiv.org e-Print Archive

Crossref

Deep Reflectance Maps

Author: Fritz Mario
Gavves Efstratios
Rematas Konstantinos
Ritschel Tobias
Tuytelaars Tinne
Publication date: 01/01/2015

Undoing the image formation process, and thereby decomposing appearance into its intrinsic properties, is a challenging task due to the under-constrained nature of this inverse problem. While significant progress has been made on inferring shape, materials and illumination from images only, progress in an unconstrained setting is still limited. We propose a convolutional neural architecture to estimate reflectance maps of specular materials in natural lighting conditions. We achieve this in an end-to-end learning formulation that directly predicts a reflectance map from the image itself. We show how to improve estimates by facilitating additional supervision in an indirect scheme that first predicts surface orientation and afterwards predicts the reflectance map by a learning-based sparse data interpolation. In order to analyze performance on this difficult task, we propose a new challenge of Specular MAterials on SHapes with complex IllumiNation (SMASHINg) using both synthetic and real images. Furthermore, we show the application of our method to a range of image-based editing tasks on real images.

Comment: project page: http://homes.esat.kuleuven.be/~krematas/DRM

arXiv.org e-Print Archive

CiteSeerX

CISPA – Helmholtz-Zentrum für Informationssicherheit

Crossref

MPG.PuRe

UvA-DARE

International Migration, Integration and Social Cohesion online publications

Geometry-Aware Network for Non-Rigid Shape Prediction from a Single View

Author: Agudo Antonio
Lepetit Vincent
Moreno-Noguer Francesc
Porzi Lorenzo
Pumarola Albert
Sanfeliu Alberto
Publication date: 01/01/2018

We propose a method for predicting the 3D shape of a deformable surface from a single view. By contrast with previous approaches, we do not need a pre-registered template of the surface, and our method is robust to the lack of texture and partial occlusions. At the core of our approach is a geometry-aware deep architecture that tackles the problem as usually done in analytic solutions: first perform 2D detection of the mesh and then estimate a 3D shape that is geometrically consistent with the image. We train this architecture in an end-to-end manner using a large dataset of synthetic renderings of shapes under different levels of deformation, material properties, textures and lighting conditions. We evaluate our approach on a test split of this dataset and available real benchmarks, consistently improving state-of-the-art solutions with a significantly lower computational time.

Comment: Accepted at CVPR 201

arXiv.org e-Print Archive

Crossref

UPCommons. Portal del coneixement obert de la UPC

HAL Descartes

Digital.CSIC

Spherical Regression: Learning Viewpoints, Surface Normals and 3D Rotations on n-Spheres

Author: Gavves Efstratios
Liao Shuai
Snoek Cees G. M.
Publication date: 01/01/2019

Many computer vision challenges require continuous outputs, but tend to be solved by discrete classification. The reason is classification's natural containment within a probability $n$-simplex, as defined by the popular softmax activation function. Regular regression lacks such a closed geometry, leading to unstable training and convergence to suboptimal local minima. Starting from this insight we revisit regression in convolutional neural networks. We observe many continuous output problems in computer vision are naturally contained in closed geometrical manifolds, like the Euler angles in viewpoint estimation or the normals in surface normal estimation. A natural framework for posing such continuous output problems are $n$-spheres, which are naturally closed geometric manifolds defined in the $\mathbb{R}^{(n+1)}$ space. By introducing a spherical exponential mapping on $n$-spheres at the regression output, we obtain well-behaved gradients, leading to stable training. We show how our spherical regression can be utilized for several computer vision challenges, specifically viewpoint estimation, surface normal estimation and 3D rotation estimation. For all these problems our experiments demonstrate the benefit of spherical regression. All paper resources are available at https://github.com/leoshine/Spherical_Regression.

Comment: CVPR 2019 camera read
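The contrast between a softmax output on the simplex and an exponential mapping onto a sphere can be sketched as follows. The abstract does not state the exact form of the spherical exponential mapping, so the version below (exponentiate each raw output, then divide by the L2 norm rather than the sum) is an assumption made by analogy with softmax, and the function names are hypothetical. Because every exponential is positive, this sketch only reaches the positive part of the sphere; how the signs of the coordinates are recovered is not covered here.

```python
import math


def softmax(logits):
    """Map raw outputs onto the probability simplex (entries sum to 1)."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(o - m) for o in logits]
    total = sum(exps)
    return [e / total for e in exps]


def spherical_exp(logits):
    """Map raw outputs onto the unit sphere (squared entries sum to 1).

    Assumed form: exp(o_i) / sqrt(sum_k exp(o_k)^2). Subtracting the max
    leaves the ratio unchanged and avoids overflow.
    """
    m = max(logits)
    exps = [math.exp(o - m) for o in logits]
    norm = math.sqrt(sum(e * e for e in exps))
    return [e / norm for e in exps]


raw = [0.3, -1.2, 2.0]  # hypothetical raw network outputs
on_simplex = softmax(raw)
on_sphere = spherical_exp(raw)
```

Both mappings constrain the output to a closed set, which is the property the abstract credits for stable training. The difference is only the normalisation: an L1 constraint for the simplex and an L2 constraint for the sphere.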

arXiv.org e-Print Archive

International Migration, Integration and Social Cohesion online publications

UvA-DARE