ShadingNet: Image Intrinsics by Fine-Grained Shading Decomposition
In general, intrinsic image decomposition algorithms interpret shading as one
unified component including all photometric effects. As shading transitions are
generally smoother than reflectance (albedo) changes, these methods may fail to
distinguish strong photometric effects from reflectance variations.
Therefore, in this paper, we propose to decompose the shading component into
direct (illumination) and indirect shading (ambient light and shadows)
subcomponents. The aim is to distinguish strong photometric effects from
reflectance variations. An end-to-end deep convolutional neural network
(ShadingNet) is proposed that operates in a fine-to-coarse manner with a
specialized fusion and refinement unit exploiting the fine-grained shading
model. The network is designed to learn reflectance cues separately from
specific photometric effects, so that its disentanglement capability can be analyzed. A
large-scale dataset of scene-level synthetic images of outdoor natural
environments is provided with fine-grained intrinsic image ground truths.
Large-scale experiments show that our approach using fine-grained shading
decompositions outperforms state-of-the-art algorithms utilizing unified
shading on the NED, MPI Sintel, GTA V, IIW, MIT Intrinsic Images, 3DRMS and SRD
datasets. Comment: Submitted to the International Journal of Computer Vision (IJCV).
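The fine-grained model described above can be illustrated with a minimal sketch, assuming shading combines additively inside the classic multiplicative intrinsic model I = A * S (function and variable names here are hypothetical, not from the paper):

```python
import numpy as np

def compose_image(albedo, direct_shading, indirect_shading):
    """Reconstruct an image from fine-grained intrinsic components:
    I = A * (S_direct + S_indirect), assuming the two shading
    subcomponents combine additively."""
    shading = direct_shading + indirect_shading
    return albedo * shading

# Toy 2x2 single-channel example (values are illustrative only)
albedo = np.array([[0.5, 0.8], [0.2, 1.0]])
direct = np.array([[0.6, 0.6], [0.1, 0.9]])   # direct illumination (sunlit vs. shadowed)
ambient = np.full((2, 2), 0.2)                # indirect term: constant ambient light
image = compose_image(albedo, direct, ambient)
```

Decomposition runs this model in reverse: recovering albedo and the two shading terms from the observed image, which is where strong photometric effects and reflectance changes must be disentangled.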
A Dataset of Multi-Illumination Images in the Wild
Collections of images under a single, uncontrolled illumination have enabled
the rapid advancement of core computer vision tasks like classification,
detection, and segmentation. But even with modern learning techniques, many
inverse problems involving lighting and material understanding remain too
severely ill-posed to be solved with single-illumination datasets. To fill this
gap, we introduce a new multi-illumination dataset of more than 1000 real
scenes, each captured under 25 lighting conditions. We demonstrate the richness
of this dataset by training state-of-the-art models for three challenging
applications: single-image illumination estimation, image relighting, and
mixed-illuminant white balance. Comment: ICCV 201
Free-viewpoint Indoor Neural Relighting from Multi-view Stereo
We introduce a neural relighting algorithm for captured indoor scenes that
allows interactive free-viewpoint navigation. Our method allows illumination to
be changed synthetically, while coherently rendering cast shadows and complex
glossy materials. We start with multiple images of the scene and a 3D mesh
obtained by multi-view stereo (MVS) reconstruction. We assume that lighting is
well-explained as the sum of a view-independent diffuse component and a
view-dependent glossy term concentrated around the mirror reflection direction.
We design a convolutional network around input feature maps that facilitate
learning of an implicit representation of scene materials and illumination,
enabling both relighting and free-viewpoint navigation. We generate these input
maps by exploiting the best elements of both image-based and physically-based
rendering. We sample the input views to estimate diffuse scene irradiance, and
compute the new illumination caused by user-specified light sources using path
tracing. To facilitate the network's understanding of materials and synthesize
plausible glossy reflections, we reproject the views and compute mirror images.
We train the network on a synthetic dataset where each scene is also
reconstructed with MVS. We show results of our algorithm relighting real indoor
scenes and performing free-viewpoint navigation with complex and realistic
glossy reflections, which have so far remained out of reach for view-synthesis
techniques.
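The lighting assumption above — a view-independent diffuse component plus a glossy term concentrated around the mirror reflection direction — can be sketched with a simple Phong-style evaluator. This is a stand-in for intuition only, not the paper's learned representation, and all names and parameters are hypothetical:

```python
import numpy as np

def shade(normal, light_dir, view_dir, diffuse, k_glossy, shininess):
    """Evaluate the assumed lighting model: a view-independent diffuse
    term plus a view-dependent glossy lobe concentrated around the
    mirror reflection direction (Phong-style sketch)."""
    n = normal / np.linalg.norm(normal)
    l = light_dir / np.linalg.norm(light_dir)
    v = view_dir / np.linalg.norm(view_dir)
    # Mirror reflection of the light direction about the surface normal
    r = 2.0 * np.dot(n, l) * n - l
    # Glossy lobe: strongest when the view aligns with the mirror direction
    glossy = k_glossy * max(np.dot(r, v), 0.0) ** shininess
    return diffuse + glossy
```

The diffuse term stays fixed as the viewpoint moves, while the glossy term peaks at the mirror configuration — which is why the method reprojects views and computes mirror images to help the network synthesize plausible glossy reflections.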
OutCast: Outdoor Single-image Relighting with Cast Shadows
We propose a relighting method for outdoor images. Our method mainly focuses
on predicting cast shadows in arbitrary novel lighting directions from a single
image, while also accounting for shading and global effects such as the sunlight
color and clouds. Previous solutions for this problem rely on reconstructing
occluder geometry, e.g. using multi-view stereo, which requires many images of
the scene. Instead, in this work we use a noisy depth map from an off-the-shelf
single-image depth estimator as our source of geometry. Whilst this can be a
good guide for some lighting effects, the resulting depth map quality is
insufficient for directly ray-tracing the shadows. Addressing this, we propose
a learned image space ray-marching layer that converts the approximate depth
map into a deep 3D representation that is fused into occlusion queries using a
learned traversal. Our proposed method is the first to achieve
state-of-the-art relighting results with only a single image as input. For
supplementary material visit our project page at:
https://dgriffiths.uk/outcast. Comment: Eurographics 2022 - Accepted.
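The classical baseline this method improves on — directly ray-marching an approximate depth map for shadow queries — can be sketched as a naive screen-space marcher. This illustrates why noisy depth is insufficient on its own; it is not the paper's learned traversal, and all names are hypothetical:

```python
import numpy as np

def ray_march_shadow(depth, px, py, light_step, n_steps=32, bias=0.01):
    """Naive screen-space shadow test over a depth map.

    Marches from pixel (px, py) along a 3D step (dx, dy, dz) in image
    space toward the light; the pixel is flagged as shadowed if the
    stored depth at a marched position is closer to the camera than the
    marched depth (i.e. an occluder blocks the light)."""
    h, w = depth.shape
    x, y, z = float(px), float(py), depth[py, px]
    dx, dy, dz = light_step
    for _ in range(n_steps):
        x, y, z = x + dx, y + dy, z + dz
        ix, iy = int(round(x)), int(round(y))
        if not (0 <= ix < w and 0 <= iy < h):
            break  # ray left the image: assume unoccluded
        if depth[iy, ix] < z - bias:
            return True  # an occluder lies between the pixel and the light
    return False

# Toy example: a near "wall" along image column x = 4 casts a shadow
depth = np.ones((8, 8))
depth[:, 4] = 0.2
in_shadow = ray_march_shadow(depth, px=1, py=4, light_step=(1.0, 0.0, 0.0))
```

With noisy estimated depth, this hard depth comparison produces unstable occlusion decisions, which motivates converting the depth map into a deep representation queried by a learned traversal instead.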
Inverse rendering techniques for physically grounded image editing
From a single picture of a scene, people can typically grasp the spatial layout immediately and even make good guesses at material properties and where light is coming from to illuminate the scene. For example, we can reliably tell which objects occlude others, what an object is made of and its rough shape, which regions are illuminated or in shadow, and so on. It is remarkable how little is known about our ability to make these determinations; as such, we are still not able to robustly "teach" computers to make the same high-level observations as people.
This document presents algorithms for understanding intrinsic scene properties from single images. The goal of these inverse rendering techniques is to estimate the configurations of scene elements (geometry, materials, luminaires, camera parameters, etc.) using only information visible in an image. Such algorithms have applications in robotics and computer graphics. One such application is physically grounded image editing: photo editing made easier by leveraging knowledge of the physical space. These applications allow sophisticated editing operations to be performed in a matter of seconds, enabling seamless addition, removal, or relocation of objects in images.
Adaptive Vision Based Scene Registration for Outdoor Augmented Reality
Augmented Reality (AR) involves adding virtual content into real scenes, which are viewed using a Head-Mounted Display or another display type. In order to place content into the user's view of a scene, the user's position and orientation relative to the scene, commonly referred to as their pose, must be determined accurately. This allows the virtual objects to be placed in the correct positions and to remain there when the user moves or the scene changes. It is achieved by tracking the user in relation to their environment using a variety of technologies. One technology which has proven to provide accurate results is computer vision. Computer vision involves a computer analysing images and achieving an understanding of them. This may mean locating objects such as faces in the images or, in the case of AR, determining the pose of the user.
One of the ultimate goals of AR systems is to be capable of operating under any condition. For example, a computer vision system must be robust under a range of different scene types, and under unpredictable environmental conditions due to variable illumination and weather. The majority of existing literature tests algorithms under the assumption of ideal or 'normal' imaging conditions. To ensure robustness under as many circumstances as possible it is also important to evaluate the systems under adverse conditions.
This thesis seeks to analyse the effects that variable illumination has on computer vision algorithms. To enable this analysis, test data is required that isolates weather and illumination effects, without other factors such as changes in viewpoint that would bias the results. A new dataset is presented which also allows controlled viewpoint differences in the presence of weather and illumination changes. This is achieved by capturing video from a camera undergoing a repeatable motion sequence. Ground-truth data is stored per frame, allowing images from the same position under differing environmental conditions to be easily extracted from the videos.
An in-depth analysis of six detection algorithms and five matching techniques demonstrates the impact that non-uniform illumination changes can have on vision algorithms. Specifically, shadows can degrade performance, reduce confidence in the system, decrease reliability, or even completely prevent successful operation.
An investigation into approaches to improve performance yields techniques that can help reduce the impact of shadows. A novel algorithm is presented that merges reference data captured at different times, resulting in reference data with minimal shadow effects. This can significantly improve performance and reliability when operating on images containing shadow effects. These advances improve the robustness of computer vision systems and extend the range of conditions in which they can operate, increasing the usefulness of the algorithms and the AR systems that employ them.
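As a rough illustration of the shadow-minimizing merge described above, a per-pixel median over reference images of the same view captured at different times suppresses transient shadows, since a shadow present in only a minority of captures is voted out. This is a simple stand-in sketch; the thesis's actual merging algorithm may differ:

```python
import numpy as np

def merge_references(images):
    """Merge same-view reference images captured at different times by
    taking the per-pixel median, suppressing transient shadow effects
    that appear in only a minority of the captures."""
    stack = np.stack(images, axis=0)  # shape: (n_images, H, W)
    return np.median(stack, axis=0)

# Toy example: one of three captures has a shadowed (darkened) pixel
shadowed = np.ones((2, 2))
shadowed[0, 0] = 0.2
merged = merge_references([np.ones((2, 2)), np.ones((2, 2)), shadowed])
```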
Applying Augmented Reality to Outdoors Industrial Use
Augmented Reality (AR) is currently gaining popularity in multiple different fields. However, the technology for AR still requires development in both hardware and software when considering industrial use. In order to create immersive AR applications, more accurate pose estimation techniques are required to determine the virtual camera's location. Pose estimation algorithms often require a lot of processing power, which makes robust pose estimation a difficult task when using mobile devices or dedicated AR tools. The difficulties are even greater in outdoor scenarios, where the environment can vary considerably and is often unprepared for AR.
This thesis aims to research different possibilities for creating AR applications for outdoor environments. Both hardware and software solutions are considered, but the focus is more on software. The majority of the thesis focuses on different visual pose estimation and tracking techniques for natural features.
During the thesis, multiple different solutions were tested for outdoor AR. One commercial AR SDK was tested, and three different custom software solutions were developed for an Android tablet. The custom software solutions were an algorithm for combining data from a magnetometer and a gyroscope, a natural feature tracker, and a tracker based on panorama images. The panorama-based tracker was implemented based on an existing scientific publication and was further developed by integrating it into Unity 3D and adding the ability to augment content.
This thesis concludes that AR is very close to becoming a usable tool for professional use. The commercial solutions currently available are not yet ready for creating professional tools, but for visualization tasks in particular, some custom solutions are capable of achieving the required robustness. The panorama tracker implemented in this thesis appears to be a promising tool for robust pose estimation in unprepared outdoor environments.
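The magnetometer/gyroscope combination mentioned above is commonly implemented as a complementary filter for heading (yaw): the gyroscope integrates smoothly but drifts, while the magnetometer is noisy but drift-free. The sketch below is one plausible formulation under that assumption; the thesis does not specify its exact algorithm:

```python
def fuse_heading(prev_heading, gyro_rate, mag_heading, dt, alpha=0.98):
    """One step of a complementary filter fusing gyroscope and
    magnetometer readings into a heading estimate (radians).

    gyro_rate:   angular rate from the gyroscope (rad/s), integrated
                 over dt for a smooth but drifting estimate.
    mag_heading: absolute heading from the magnetometer, noisy but
                 drift-free.
    alpha:       blend factor; close to 1.0 trusts the gyro short-term
                 while the magnetometer slowly corrects the drift.
    """
    gyro_heading = prev_heading + gyro_rate * dt
    return alpha * gyro_heading + (1.0 - alpha) * mag_heading
```

Called once per sensor sample, the filter tracks fast rotations via the gyroscope while the small magnetometer weight continuously pulls the estimate back toward the true north-referenced heading.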
Learning geometric and lighting priors from natural images
Understanding images is needed for a plethora of tasks, from compositing to image relighting to 3D object reconstruction. These tasks allow artists to realize masterpieces or help operators make safe decisions based on visual stimuli. For many of these tasks, the physical and geometric models that the scientific community has developed give rise to ill-posed problems with several solutions, only one of which is generally reasonable. To resolve these indeterminations, reasoning about the visual and semantic context of a scene is usually delegated to an artist or an expert who uses their experience to carry out the work. This is because it is generally necessary to reason globally about the scene in order to obtain plausible and appreciable results. Would it be possible to model this experience from visual data and partly or fully automate these tasks? That is the topic of this thesis: modeling priors using deep machine learning to solve typically ill-posed problems. More specifically, we cover three research axes: 1) surface reconstruction using photometric cues, 2) outdoor illumination estimation from a single image, and 3) camera calibration estimation from a single image with generic content. These three topics are addressed from a data-driven perspective. Each of these axes includes in-depth performance analyses and, despite the reputation for opacity of deep machine learning algorithms, we offer studies of the visual cues captured by our methods.
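The first research axis, surface reconstruction using photometric cues, builds on classical Lambertian photometric stereo, which can be sketched as a per-pixel least-squares problem. This is the textbook baseline, not the thesis's learned method:

```python
import numpy as np

def photometric_stereo(intensities, light_dirs):
    """Recover a surface normal and albedo at one pixel from several
    intensity measurements under known directional lights, assuming
    the Lambertian model I = albedo * (n . l).

    intensities: (k,) observed intensities under k lights.
    light_dirs:  (k, 3) unit light directions.
    """
    L = np.asarray(light_dirs, dtype=float)
    I = np.asarray(intensities, dtype=float)
    # Least-squares solve for g = albedo * n from L @ g = I
    g, *_ = np.linalg.lstsq(L, I, rcond=None)
    albedo = np.linalg.norm(g)
    normal = g / albedo
    return normal, albedo
```

The system needs at least three non-coplanar lights to be well posed; with fewer cues, or with non-Lambertian surfaces, the problem becomes ill-posed — exactly the regime where the learned priors studied in this thesis are meant to help.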
Computer vision models in surveillance robotics
In this thesis, we developed algorithms that use visual information to perform real-time detection, recognition, and classification of moving objects, independently of environmental conditions and with the best possible accuracy.
To this end, we developed several computer vision components, namely the identification of objects of interest across the whole visual scene (monocular or stereo) and their classification.
During the research, several approaches were tried, including the detection of candidate objects through image segmentation with weak classifiers and centroids, image segmentation algorithms reinforced with stereo information and noise reduction, and the combination of popular features such as scale-invariant features (SIFT) with distance information.
We developed two broad categories of solutions according to the type of system used. With a mobile camera, we favored the detection of known objects by scanning the image; with a fixed camera, we also used algorithms for detecting foreground and moving objects (foreground detection).
In the case of foreground detection, the detection and classification rates increase if the quality of the extracted objects is high. We propose methods to reduce the effects of shadows, illumination, and repetitive motions produced by moving objects.
An important aspect studied is the possibility of using algorithms for detecting moving objects with a mobile camera.
Efficient solutions are becoming increasingly complex, but the computing hardware that runs these algorithms is also becoming more powerful, and in recent years graphics card (GPU) architectures have shown great potential. We proposed a GPU-based solution for background image management in order to increase detection performance.
In this thesis we studied the detection and tracking of people for applications such as the prevention of risky situations (street crossing) and people counting for traffic analysis. We studied these problems and explored various aspects of detecting individuals, groups, and people in crowded scenes.
However, in a generic environment it is impossible to predict the configuration of objects that will be captured by the camera. In these cases, it is necessary to "abstract the concept" of an object. With this requirement in mind, we explored the properties of stochastic methods and show that good classification rates can be obtained provided that the training set is large enough.
A flexible framework must be able to detect moving regions and recognize the objects of interest. We developed a framework for handling the detection and classification problems.
Compared to other methods, the proposed methods offer a flexible framework for object detection and classification that can be used efficiently in a variety of indoor and outdoor environments.
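The background image management and foreground detection discussed above can be illustrated with a per-pixel running-average background model, a common baseline for fixed-camera surveillance (a simplified stand-in for the thesis's GPU implementation; names and thresholds are illustrative):

```python
import numpy as np

def update_background(background, frame, alpha=0.05):
    """Exponential running-average update of the background model:
    a small alpha lets the model adapt slowly to gradual changes
    (e.g. illumination drift) without absorbing moving objects."""
    return (1.0 - alpha) * background + alpha * frame

def foreground_mask(background, frame, threshold=0.1):
    """Flag pixels whose value differs from the background model by
    more than a threshold; these form the candidate moving objects."""
    return np.abs(frame - background) > threshold

# Toy example: a single bright moving pixel against a dark background
bg = np.zeros((2, 2))
frame = np.zeros((2, 2))
frame[0, 0] = 1.0
mask = foreground_mask(bg, frame)
bg = update_background(bg, frame)
```

Each per-pixel operation here is independent, which is what makes this kind of background maintenance a natural fit for the GPU acceleration proposed in the thesis.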