Deep Learning Techniques applied to Photometric Stereo
This thesis studies the state of the art in deep-learning photometric stereo: Self-calibrating Deep Photometric Stereo Networks. The model consists of two networks: the first predicts the direction and intensity of the lights, and the second predicts the surface normals. The aim of the thesis is to identify the model's limitations and to determine whether it can be modified to perform well in real-world scenarios too. The thesis project is based on fine-tuning, a supervised transfer-learning technique. For this purpose, a new dataset was created by acquiring images in the laboratory. The ground truth is obtained through a distillation technique. Specifically, the light directions are obtained by running two light-calibration algorithms and merging the two results. Similarly, the surface normals are obtained by merging the results of several photometric stereo algorithms. The results of the thesis are very promising. The error in predicting the direction and intensity of the lights is one third of the original model's error. The surface normal predictions can only be assessed qualitatively, but the improvements are evident. This work shows that transfer learning can be applied to deep-learning photometric stereo. It is therefore not necessary to train a new model from scratch; existing models can be leveraged to improve performance and reduce training time.
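As an illustration of the ground-truth step above, the sketch below merges two light-direction estimates and measures a light-direction error in degrees. The abstract does not say how the two calibration results are merged; averaging the unit vectors and renormalizing is an assumption made here, and the function names are hypothetical.

```python
import math

def normalize(v):
    """Scale a 3-vector to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    return [c / n for c in v]

def fuse_directions(d1, d2):
    """Assumed merging rule: average two unit directions, then renormalize.

    Not defined for exactly opposite directions (the sum would be zero).
    """
    a, b = normalize(d1), normalize(d2)
    return normalize([x + y for x, y in zip(a, b)])

def angular_error_deg(pred, gt):
    """Angle in degrees between a predicted and a reference direction."""
    a, b = normalize(pred), normalize(gt)
    cos = max(-1.0, min(1.0, sum(x * y for x, y in zip(a, b))))
    return math.degrees(math.acos(cos))

# Two hypothetical calibration outputs that disagree by 45 degrees.
est_a = [0.0, 0.0, 1.0]
est_b = [0.0, 1.0, 1.0]
fused = fuse_directions(est_a, est_b)
err_a = angular_error_deg(fused, est_a)
err_b = angular_error_deg(fused, est_b)
```

With this rule the fused direction bisects the two estimates, so it lies 22.5 degrees from each of them in this example.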
A Neural Height-Map Approach for the Binocular Photometric Stereo Problem
In this work we propose a novel, highly practical, binocular photometric
stereo (PS) framework, which has the same acquisition speed as single-view PS
but significantly improves the quality of the estimated geometry.
As in recent neural multi-view shape estimation frameworks such as NeRF,
SIREN and inverse graphics approaches to multi-view photometric stereo (e.g.
PS-NeRF), we formulate the shape estimation task as learning a differentiable
surface and texture representation by minimising the surface normal discrepancy
for normals estimated from multiple varying-light images for two views, as well
as the discrepancy between rendered surface intensity and observed images. Our method
differs from typical multi-view shape estimation approaches in two key ways.
First, our surface is represented not as a volume but as a neural heightmap
where heights of points on a surface are computed by a deep neural network.
Second, instead of predicting an average intensity as PS-NeRF does or introducing
Lambertian material assumptions as Guo et al. do, we use a learnt BRDF and perform
near-field per-point intensity rendering.
Our method achieves state-of-the-art performance on the DiLiGenT-MV
dataset adapted to a binocular stereo setup as well as on a new binocular
photometric stereo dataset, LUCES-ST.
Comment: WACV 202
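To make the heightmap representation concrete, the sketch below derives a unit surface normal from a height function z = f(x, y). It is only an illustration under stated assumptions: the analytic paraboloid stands in for the paper's neural network, finite differences stand in for automatic differentiation, and an orthographic parameterisation is assumed. All names are hypothetical.

```python
import math

def height(x, y):
    """Hypothetical stand-in for the neural heightmap: a paraboloid."""
    return 0.5 * (x * x + y * y)

def normal_from_height(f, x, y, eps=1e-5):
    """Unit normal of the surface z = f(x, y) at (x, y).

    Uses central differences for the gradient; the normal is
    proportional to (-dz/dx, -dz/dy, 1).
    """
    dzdx = (f(x + eps, y) - f(x - eps, y)) / (2 * eps)
    dzdy = (f(x, y + eps) - f(x, y - eps)) / (2 * eps)
    n = (-dzdx, -dzdy, 1.0)
    norm = math.sqrt(sum(c * c for c in n))
    return tuple(c / norm for c in n)

n_flat = normal_from_height(height, 0.0, 0.0)   # surface is flat at the apex
n_slope = normal_from_height(height, 1.0, 0.0)  # slope of 1 along x
```

A normal computed this way can be compared against normals estimated by photometric stereo, which is the kind of discrepancy the abstract describes minimising.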
A CNN Based Approach for the Point-Light Photometric Stereo Problem
Reconstructing the 3D shape of an object using several images under different
light sources is a very challenging task, especially when realistic assumptions
such as light propagation and attenuation, perspective viewing geometry and
specular light reflection are considered. Many works tackling Photometric
Stereo (PS) problems relax most of the aforementioned assumptions.
In particular, they ignore specular reflection and global illumination effects. In
this work, we propose a CNN-based approach capable of handling these realistic
assumptions by leveraging recent improvements of deep neural networks for
far-field Photometric Stereo and adapting them to the point-light setup. We
achieve this by employing an iterative procedure of point-light PS for shape
estimation, which has two main steps. First, we train a per-pixel CNN to
predict surface normals from reflectance samples. Second, we compute the
depth by integrating the normal field in order to iteratively estimate light
directions and attenuation, which are used to compensate the input images and
compute reflectance samples for the next iteration.
Our approach significantly outperforms the state of the art on the DiLiGenT
real-world dataset. Furthermore, in order to measure the performance of our
approach on near-field point-light source PS data, we introduce LUCES, the
first real-world 'dataset for near-fieLd point light soUrCe photomEtric Stereo',
comprising 14 objects of different materials, where the effects of point light
sources and perspective viewing are much more significant. Our approach
outperforms the competition on this dataset as well. Data and test code are
available at the project page.
Comment: arXiv admin note: text overlap with arXiv:2009.0579
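The compensation step in the iterative procedure above can be sketched for a single surface point. This is a minimal sketch, not the authors' implementation: it assumes an isotropic point light with inverse-square falloff and a Lambertian surface, neither of which the abstract specifies, and all function names are hypothetical.

```python
import math

def point_light(light_pos, surface_pt):
    """Per-point unit light direction and assumed inverse-square attenuation."""
    d = [l - p for l, p in zip(light_pos, surface_pt)]
    dist = math.sqrt(sum(c * c for c in d))
    return [c / dist for c in d], 1.0 / (dist * dist)

def lambertian(albedo, normal, light_dir):
    """Lambertian shading with attached-shadow clamping."""
    return albedo * max(0.0, sum(n * l for n, l in zip(normal, light_dir)))

def compensate(intensity, attenuation):
    """Divide out the attenuation so the sample resembles a far-field one."""
    return intensity / attenuation

# A point at the origin facing +z, lit by a point source 2 units above it.
light_dir, atten = point_light([0.0, 0.0, 2.0], [0.0, 0.0, 0.0])
observed = lambertian(0.8, [0.0, 0.0, 1.0], light_dir) * atten
sample = compensate(observed, atten)
```

In the iterative scheme, the current depth estimate supplies the surface point, so the light direction and attenuation are re-estimated each time the depth is updated.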
Scalable, Detailed and Mask-Free Universal Photometric Stereo
In this paper, we introduce SDM-UniPS, a groundbreaking Scalable, Detailed,
Mask-free, and Universal Photometric Stereo network. Our approach can recover
astonishingly intricate surface normal maps, rivaling the quality of 3D
scanners, even when images are captured under unknown, spatially-varying
lighting conditions in uncontrolled environments. We have extended previous
universal photometric stereo networks to extract spatial-light features,
utilizing all available information in high-resolution input images and
accounting for non-local interactions among surface points. Moreover, we
present a new synthetic training dataset that encompasses a diverse range of
shapes, materials, and illumination scenarios found in real-world scenes.
Through extensive evaluation, we demonstrate that our method not only surpasses
calibrated, lighting-specific techniques on public benchmarks, but also excels
with a significantly smaller number of input images even without object masks.
Comment: CVPR 2023 (Highlight). The source code will be available at
https://github.com/satoshi-ikehata/SDM-UniPS-CVPR202
Photometric Depth Super-Resolution
This study explores the use of photometric techniques (shape-from-shading and
uncalibrated photometric stereo) for upsampling the low-resolution depth map
from an RGB-D sensor to the higher resolution of the companion RGB image. A
single-shot variational approach is first put forward, which is effective as
long as the target's reflectance is piecewise-constant. It is then shown that
this dependency upon a specific reflectance model can be relaxed by focusing on
a specific class of objects (e.g., faces) and delegating reflectance estimation
to a deep neural network. A multi-shot strategy based on randomly varying
lighting conditions is eventually discussed. It requires no training or prior
on the reflectance, yet this comes at the price of a dedicated acquisition
setup. Both quantitative and qualitative evaluations illustrate the
effectiveness of the proposed methods on synthetic and real-world scenarios.
Comment: IEEE Transactions on Pattern Analysis and Machine Intelligence
(T-PAMI), 2019. First three authors contribute equall
PS-Transformer: Learning Sparse Photometric Stereo Network using Self-Attention Mechanism
Existing deep calibrated photometric stereo networks aggregate
observations under different lights using pre-defined operations such as
linear projection and max pooling. While these are effective with dense
capture, simple first-order operations often fail to capture the high-order
interactions among observations under a small number of different lights. To
tackle this issue, this paper presents a deep sparse calibrated photometric
stereo network named PS-Transformer, which leverages a learnable
self-attention mechanism to properly capture the complex inter-image
interactions. PS-Transformer builds upon a dual-branch design to explore both
pixel-wise and image-wise features, and each feature is trained with
intermediate surface normal supervision to maximize geometric feasibility. A
new synthetic dataset named CyclesPS+ is also presented with a comprehensive
analysis to successfully train the photometric stereo networks. Extensive
results on the publicly available benchmark datasets demonstrate that the
surface normal prediction accuracy of the proposed method significantly
outperforms other state-of-the-art algorithms with the same number of input
images and is even comparable to that of dense algorithms that take
10 times as many images as input.
Comment: BMVC2021. Code and Supplementary are available at
https://github.com/satoshi-ikehata/PS-Transformer-BMVC202
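The contrast the abstract draws between pre-defined aggregation and self-attention can be illustrated on toy per-light feature vectors. This sketch is not the PS-Transformer architecture: it uses a single attention head with identity query, key and value projections in place of learned ones, and the feature values are made up.

```python
import math

def max_pool(features):
    """Element-wise max over per-light feature vectors (a fixed operation)."""
    return [max(col) for col in zip(*features)]

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def self_attention(features):
    """Scaled dot-product self-attention with identity projections (assumed)."""
    d = len(features[0])
    out, weights = [], []
    for q in features:
        # Each light's feature is scored against every other light's feature.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in features]
        w = softmax(scores)
        weights.append(w)
        out.append([sum(wi * v[j] for wi, v in zip(w, features)) for j in range(d)])
    return out, weights

# Hypothetical features for three lights, two channels each.
feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
pooled = max_pool(feats)
attended, attn = self_attention(feats)
```

Max pooling collapses the three observations to one vector regardless of how they relate to each other, whereas each attention output is a weighted mix whose weights depend on pairwise similarity between observations.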