CGIntrinsics: Better Intrinsic Image Decomposition through Physically-Based Rendering
Intrinsic image decomposition is a challenging, long-standing computer vision problem for which ground truth data is very difficult to acquire. We explore the use of synthetic data for training CNN-based intrinsic image decomposition models, then apply these learned models to real-world images. To that end, we present CGIntrinsics, a new, large-scale dataset of physically-based rendered images of scenes with full ground truth decompositions. The rendering process we use is carefully designed to yield high-quality, realistic images, which we find to be crucial for this problem domain. We also propose a new end-to-end training method that learns better decompositions by leveraging CGIntrinsics, and optionally IIW and SAW, two recent datasets of sparse annotations on real-world images. Surprisingly, we find that a decomposition network trained solely on our synthetic data outperforms the state of the art on both IIW and SAW, and performance improves even further when IIW and SAW data are added during training. Our work demonstrates the surprising effectiveness of carefully rendered synthetic data for the intrinsic images task.
Comment: Paper for 'CGIntrinsics: Better Intrinsic Image Decomposition through Physically-Based Rendering' published in ECCV, 2018
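For readers new to the task: the standard intrinsic image model underlying this and the next abstract (the usual formulation in this line of work, not spelled out in either abstract) factors an observed image into an illumination-invariant albedo layer and a shading layer,

\[ I(x) = A(x) \cdot S(x), \]

where I(x) is the observed intensity at pixel x, A(x) the albedo (reflectance), and S(x) the shading produced by geometry and lighting. Recovering A and S from I alone is ill-posed, which is why dense ground truth such as the CGIntrinsics renderings is so valuable.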
Joint Learning of Intrinsic Images and Semantic Segmentation
Semantic segmentation of outdoor scenes is problematic when there are variations in imaging conditions. It is known that albedo (reflectance) is invariant to all kinds of illumination effects. Thus, using reflectance images for the semantic segmentation task can be favorable. Additionally, not only may segmentation benefit from reflectance, but segmentation may in turn be useful for reflectance computation. Therefore, in this paper, the tasks of semantic segmentation and intrinsic image decomposition are considered as a combined process by exploring their mutual relationship in a joint fashion. To that end, we propose a supervised end-to-end CNN architecture to jointly learn intrinsic image decomposition and semantic segmentation. We analyze the gains of addressing those two problems jointly. Moreover, new cascade CNN architectures for intrinsic-for-segmentation and segmentation-for-intrinsic are proposed as single tasks. Furthermore, a dataset of 35K synthetic images of natural environments is created with corresponding albedo and shading (intrinsics), as well as semantic labels (segmentation) assigned to each object/scene. The experiments show that joint learning of intrinsic image decomposition and semantic segmentation is beneficial for both tasks for natural scenes. Dataset and models are available at: https://ivi.fnwi.uva.nl/cv/intrinseg
Comment: ECCV 2018
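A minimal sketch of the intrinsic-for-segmentation cascade idea described above. The module layout, layer sizes, and names are my illustrative assumptions, not the paper's architecture; the point is only the wiring, i.e. segmenting from the predicted illumination-invariant albedo:

import torch
import torch.nn as nn

class IntrinsicForSegmentation(nn.Module):
    """Toy cascade: first decompose, then segment from the albedo."""

    def __init__(self, num_classes: int):
        super().__init__()
        # Stage 1: intrinsic decomposition head (3 albedo + 3 shading channels).
        self.decompose = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 6, 3, padding=1),
        )
        # Stage 2: segmentation head fed with the illumination-invariant albedo.
        self.segment = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, num_classes, 3, padding=1),
        )

    def forward(self, image: torch.Tensor):
        intrinsics = self.decompose(image)
        albedo, shading = intrinsics[:, :3], intrinsics[:, 3:]
        logits = self.segment(albedo)  # segmentation sees reflectance, not illumination
        return albedo, shading, logits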
Learning Shape Priors for Single-View 3D Completion and Reconstruction
The problem of single-view 3D shape completion or reconstruction is challenging, because among the many possible shapes that explain an observation, most are implausible and do not correspond to natural objects. Recent research in the field has tackled this problem by exploiting the expressiveness of deep convolutional networks. However, there is another level of ambiguity that is often overlooked: among plausible shapes, there are still multiple shapes that fit the 2D image equally well; i.e., the ground truth shape is non-deterministic given a single-view input. Existing fully supervised approaches fail to address this issue, and often produce blurry mean shapes with smooth surfaces but no fine details.
In this paper, we propose ShapeHD, pushing the limit of single-view shape completion and reconstruction by integrating deep generative models with adversarially learned shape priors. The learned priors serve as a regularizer, penalizing the model only if its output is unrealistic, not if it deviates from the ground truth. Our design thus overcomes both of the aforementioned levels of ambiguity. Experiments demonstrate that ShapeHD outperforms the state of the art by a large margin in both shape completion and shape reconstruction on multiple real datasets.
Comment: ECCV 2018. The first two authors contributed equally to this work. Project page: http://shapehd.csail.mit.edu
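A hedged sketch of how an adversarially learned shape prior can act as a regularizer rather than a ground-truth match. The loss names, the discriminator interface, and the lam weight are my illustration; ShapeHD's exact naturalness loss may differ:

import torch
import torch.nn.functional as F

def shape_loss(pred_voxels, gt_voxels, discriminator, lam=0.1):
    """Combine a supervised reconstruction term with a realism term.

    pred_voxels holds occupancy probabilities in [0, 1]. The
    reconstruction term pulls the prediction toward the ground truth;
    the adversarial term penalizes the model only when the (hypothetical)
    discriminator judges the predicted shape unrealistic.
    """
    recon = F.binary_cross_entropy(pred_voxels, gt_voxels)
    realism = discriminator(pred_voxels)           # in (0, 1), higher = more real
    naturalness = -torch.log(realism + 1e-8).mean()
    return recon + lam * naturalness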
Learning Task-Specific Generalized Convolutions in the Permutohedral Lattice
Dense prediction tasks typically employ encoder-decoder architectures, but
the prevalent convolutions in the decoder are not image-adaptive and can lead
to boundary artifacts. Different generalized convolution operations have been
introduced to counteract this. We go beyond these by leveraging guidance data
to redefine their inherent notion of proximity. Our proposed network layer
builds on the permutohedral lattice, which performs sparse convolutions in a
high-dimensional space allowing for powerful non-local operations despite small
filters. Multiple features with different characteristics span this
permutohedral space. In contrast to prior work, we learn these features in a
task-specific manner by generalizing the basic permutohedral operations to
learnt feature representations. As the resulting objective is complex, a
carefully designed framework and learning procedure are introduced, yielding
rich feature embeddings in practice. We demonstrate the general applicability
of our approach in different joint upsampling tasks. When adding our network
layer to state-of-the-art networks for optical flow and semantic segmentation,
boundary artifacts are removed and the accuracy is improved.
Comment: To appear at GCPR 2019
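To make "guidance data redefines proximity" concrete, here is a minimal, brute-force sketch of guidance-driven filtering in the bilateral spirit (my own illustration; the permutohedral lattice makes this sparse and fast in high dimensions, and the guidance features are learned rather than fixed as here):

import numpy as np

def guided_filter_1d(signal, guidance, sigma=0.5):
    """Average each sample with neighbors that are close in guidance
    space rather than in pixel space: the features define proximity."""
    out = np.zeros_like(signal)
    for i in range(len(signal)):
        # Weights fall off with guidance-feature distance (e.g. color, depth).
        w = np.exp(-((guidance - guidance[i]) ** 2) / (2 * sigma**2))
        out[i] = np.sum(w * signal) / np.sum(w)
    return out

# A boundary in the guidance keeps the two regions from bleeding together:
signal = np.array([1.0, 1.1, 0.9, 5.0, 5.2, 4.9])
guidance = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])
print(guided_filter_1d(signal, guidance))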
Decision Models and Technology Can Help Psychiatry Develop Biomarkers
Why is psychiatry unable to define clinically useful biomarkers? We explore this question from the vantage of data and decision science and consider biomarkers as a form of phenotypic data that resolves a well-defined clinical decision. We introduce a framework that systematizes different forms of phenotypic data and further introduce the concept of a decision model to describe the strategies a clinician uses to seek out, combine, and act on clinical data. Though many medical specialties rely on quantitative clinical data and operationalized decision models, we observe that, in psychiatry, clinical data are gathered and used in idiosyncratic decision models that exist solely in the clinician's mind and therefore are outside empirical evaluation. This, we argue, is a fundamental reason why psychiatry is unable to define clinically useful biomarkers: because psychiatry does not currently quantify clinical data, decision models cannot be operationalized and, in the absence of an operationalized decision model, it is impossible to define how a biomarker might be of use. Here, psychiatry might benefit from digital technologies that have recently emerged specifically to quantify clinically relevant facets of human behavior. We propose that digital tools might help psychiatry in two ways: first, by quantifying data already present in the standard clinical interaction and by allowing decision models to be operationalized and evaluated; second, by testing whether new forms of data might have value within an operationalized decision model. We reference successes from other medical specialties to illustrate how quantitative data and operationalized decision models improve patient care.
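Purely to illustrate what an "operationalized decision model" could look like in code (a toy rule with entirely hypothetical thresholds, not from the paper): because every input and cutoff is explicit and quantified, the rule can be evaluated empirically, and a candidate biomarker can be tested by adding it as an input.

def triage_decision(phq9_score: int, avg_sleep_hours: float) -> str:
    """Toy, fully explicit decision rule over quantified clinical data.

    PHQ-9 is a standard depression questionnaire (scored 0-27); the
    sleep feature stands in for passively sensed behavioral data.
    All thresholds here are hypothetical.
    """
    if phq9_score >= 20:
        return "refer for urgent evaluation"
    if phq9_score >= 10 and avg_sleep_hours < 5.0:
        return "schedule follow-up within one week"
    return "routine monitoring"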
AUTO3D: Novel view synthesis through unsupervisely learned variational viewpoint and global 3D representation
This paper targets learning-based novel view synthesis from a single or a limited number of 2D images, without pose supervision. In viewer-centered coordinates, we construct an end-to-end trainable conditional variational framework to disentangle the relative pose/rotation, learned without supervision, from an implicit global 3D representation (shape, texture, the origin of the viewer-centered coordinates, etc.). The global appearance of the 3D object is given by several appearance-describing images taken from any number of viewpoints. Our spatial correlation module extracts a global 3D representation from the appearance-describing images in a permutation-invariant manner. Our system achieves implicit 3D understanding without explicit 3D reconstruction. With the viewer-centered relative-pose/rotation code learned without supervision, the decoder can hallucinate novel views continuously by sampling the relative pose from a prior distribution. In various applications, we demonstrate that our model can achieve comparable or even better results than pose/3D-model-supervised learning-based novel view synthesis (NVS) methods with any number of input views.
Comment: ECCV 2020
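A minimal sketch of the sampling step the abstract describes: encode appearance once, then decode novel views from pose codes drawn from a prior. The encoder/decoder interfaces, the 6-dimensional pose code, and the Gaussian prior are my assumptions for illustration:

import torch

def synthesize_novel_views(encoder, decoder, images, n_views=8):
    """Sweep the relative-pose code while holding appearance fixed.

    encoder: maps appearance-describing images to a global 3D
             representation (permutation invariant over the inputs)
    decoder: renders one view from that representation plus a pose code
    Both are hypothetical stand-ins for the paper's modules.
    """
    global_repr = encoder(images)
    views = []
    for _ in range(n_views):
        pose_code = torch.randn(1, 6)   # sample a relative pose from the prior
        views.append(decoder(global_repr, pose_code))
    return views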
Self-supervised Outdoor Scene Relighting
Outdoor scene relighting is a challenging problem that requires a good understanding of the scene geometry, illumination, and albedo. Current techniques are completely supervised, requiring high-quality synthetic renderings to train a solution. Such renderings are synthesized using priors learned from limited data. In contrast, we propose a self-supervised approach for relighting. Our approach is trained only on corpora of images collected from the internet without any user supervision. This virtually endless source of training data allows training a general relighting solution. Our approach first decomposes an image into its albedo, geometry and illumination. A novel relighting is then produced by modifying the illumination parameters. Our solution captures shadows using a dedicated shadow prediction map, and does not rely on accurate geometry estimation. We evaluate our technique subjectively and objectively using a new dataset with ground-truth relighting. Results show the ability of our technique to produce photo-realistic and physically plausible results that generalize to unseen scenes.
Comment: Published in ECCV '20, http://gvv.mpi-inf.mpg.de/projects/SelfRelight
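A hedged sketch of the relight-by-editing-illumination step: re-render the decomposition under a new light. This uses a plain Lambertian model with a single directional light, which is my simplification; the paper's illumination model and learned shadow prediction are more sophisticated:

import numpy as np

def relight(albedo, normals, light_dir, shadow_map=None):
    """Re-render a decomposed scene under a new directional light.

    albedo:     (H, W, 3) reflectance from the decomposition step
    normals:    (H, W, 3) unit surface normals
    light_dir:  (3,) direction toward the new light source
    shadow_map: optional (H, W) attenuation predicted by a shadow network
    """
    l = light_dir / np.linalg.norm(light_dir)
    shading = np.clip(normals @ l, 0.0, None)    # Lambertian n.l term
    if shadow_map is not None:
        shading = shading * shadow_map           # darken occluded pixels
    return albedo * shading[..., None]           # the relit image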
Occlusion-aware 3D Morphable Models and an Illumination Prior for Face Image Analysis
Faces in natural images are often occluded by a variety of objects. We propose a fully automated, probabilistic and occlusion-aware 3D morphable face model adaptation framework following an analysis-by-synthesis setup. The key idea is to segment the image into regions explained by separate models. Our framework includes a 3D morphable face model, a prototype-based beard model and a simple model for occlusions and background regions. The segmentation and all the model parameters have to be inferred from the single target image. Face model adaptation and segmentation are solved jointly using an expectation-maximization-like procedure. During the E-step, we update the segmentation; during the M-step, we update the face model parameters. For face model adaptation we apply a stochastic sampling strategy based on the Metropolis-Hastings algorithm. For segmentation, we apply loopy belief propagation for inference in a Markov random field. Illumination estimation is critical for occlusion handling. Our combined segmentation and model adaptation needs a proper initialization of the illumination parameters. We propose a RANSAC-based robust illumination estimation technique. By applying this method to a large face image database we obtain a first empirical distribution of real-world illumination conditions. The obtained empirical distribution is made publicly available and can be used as a prior in probabilistic frameworks, for regularization, or to synthesize data for deep learning methods.
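A compact sketch of the EM-like alternation described above, written as a higher-order loop so it stays self-contained. The e_step and m_step callables are placeholders for the paper's loopy belief propagation and Metropolis-Hastings updates, respectively:

def fit_face_model(image, params, segmentation, e_step, m_step, n_iters=20):
    """Alternate pixel assignment (E) and model fitting (M).

    e_step: reassigns each pixel to face / beard / occlusion / background
    m_step: updates model parameters on the pixels they currently explain
    """
    for _ in range(n_iters):
        segmentation = e_step(image, params, segmentation)   # E-step
        params = m_step(image, params, segmentation)         # M-step
    return params, segmentation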
Polarimetric Multi-View Inverse Rendering
A polarization camera has great potential for 3D reconstruction since the
angle of polarization (AoP) of reflected light is related to an object's
surface normal. In this paper, we propose a novel 3D reconstruction method
called Polarimetric Multi-View Inverse Rendering (Polarimetric MVIR) that
effectively exploits geometric, photometric, and polarimetric cues extracted
from input multi-view color polarization images. We first estimate camera poses
and an initial 3D model by geometric reconstruction with a standard
structure-from-motion and multi-view stereo pipeline. We then refine the
initial model by optimizing photometric and polarimetric rendering errors using
multi-view RGB and AoP images, where we propose a novel polarimetric rendering
cost function that enables us to effectively constrain each estimated surface
vertex's normal while considering four possible ambiguous azimuth angles
derived from the AoP measurement. Experimental results using both synthetic and real data demonstrate that our Polarimetric MVIR can reconstruct a detailed 3D shape without assuming a specific, material-dependent type of polarized reflection.
Comment: Paper accepted in ECCV 2020
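For context on the "four possible ambiguous azimuth angles": the measured AoP phi constrains the azimuth alpha of the surface normal only up to a pi-ambiguity, plus a pi/2 shift between diffuse- and specular-dominant polarized reflection, giving the standard candidate set (the common polarimetric shape relation, not necessarily the paper's exact notation):

\[ \alpha \in \left\{ \phi,\ \phi + \tfrac{\pi}{2},\ \phi + \pi,\ \phi + \tfrac{3\pi}{2} \right\} \pmod{2\pi}. \]

The proposed polarimetric rendering cost therefore has to remain consistent with all four candidates rather than committing to one of them.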