Learning to Predict Image-based Rendering Artifacts with Respect to a Hidden Reference Image
Image metrics predict the perceived per-pixel difference between a reference
image and its degraded (e.g., re-rendered) version. In several important
applications, the reference image is not available and image metrics cannot be
applied. We devise a neural network architecture and training procedure that
allows predicting the MSE, SSIM or VGG16 image difference from the distorted
image alone while the reference is not observed. This is enabled by two
insights: The first is to inject sufficiently many un-distorted natural image
patches, which can be found in arbitrary amounts and are known to have no
perceivable difference to themselves. This avoids false positives. The second
is to balance the learning, carefully ensuring that all image
errors are equally likely, which avoids false negatives. Surprisingly, we observe
that the resulting no-reference metric can, subjectively, even perform better
than the reference-based one, as it had to become robust against
mis-alignments. We evaluate the effectiveness of our approach in an image-based
rendering context, both quantitatively and qualitatively. Finally, we
demonstrate two applications which reduce light field capture time and provide
guidance for interactive depth adjustment.
Comment: 13 pages, 11 figures
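The two training insights above (injecting pristine patches and balancing error magnitudes) lend themselves to a short sketch. The following illustrative NumPy snippet resamples training patches so that every error magnitude is equally represented; the function name and the histogram-based binning scheme are our own simplification, not the paper's procedure. Pristine patches with a target error of zero would be appended to `patches`/`errors` beforehand.

```python
import numpy as np

def balance_by_error(patches, errors, n_bins=10, rng=None):
    """Resample (patch, error) pairs so all error magnitudes become
    equally likely, in the spirit of the abstract's 'balanced
    learning'. Undistorted patches (error == 0) are assumed to have
    been injected into the pool already, so the regressor also sees
    inputs whose correct prediction is zero."""
    rng = np.random.default_rng(rng)
    # Assign each error to one of n_bins equal-width magnitude bins.
    edges = np.linspace(errors.min(), errors.max(), n_bins + 1)[1:-1]
    bins = np.digitize(errors, edges)
    # Oversample every bin up to the size of the largest bin.
    per_bin = max(np.bincount(bins, minlength=n_bins).max(), 1)
    idx = []
    for b in range(n_bins):
        members = np.flatnonzero(bins == b)
        if members.size:
            idx.append(rng.choice(members, size=per_bin, replace=True))
    idx = np.concatenate(idx)
    return patches[idx], errors[idx]
```

A network trained on the resampled pairs then regresses the error value from the distorted patch alone.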
RLFC: Random Access Light Field Compression using Key Views and Bounded Integer Encoding
We present a new hierarchical compression scheme for encoding light field
images (LFI) that is suitable for interactive rendering. Our method (RLFC)
exploits redundancies in the light field images by constructing a tree
structure. The top level (root) of the tree captures the common high-level
details across the LFI, and other levels (children) of the tree capture
specific low-level details of the LFI. Our decompressing algorithm corresponds
to tree traversal operations and gathers the values stored at different levels
of the tree. Furthermore, we use bounded integer sequence encoding which
provides random access and fast hardware decoding for compressing the blocks of
children of the tree. We have evaluated our method for 4D two-plane
parameterized light fields. The compression rates vary from 0.08 - 2.5 bits per
pixel (bpp), resulting in compression ratios of around 200:1 to 20:1 for a PSNR
quality of 40 to 50 dB. The decompression times for decoding the blocks of LFI
are 1 - 3 microseconds per channel on an NVIDIA GTX-960 and we can render new
views with a resolution of 512×512 at 200 fps. Our overall scheme is simple to
implement and involves only bit manipulations and integer arithmetic
operations.
Comment: Accepted for publication at Symposium on Interactive 3D Graphics and Games (I3D '19)
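The random-access property of bounded integer sequence encoding comes from giving every value in a block the same fixed bit width, so the i-th value sits at a computable bit offset. A minimal Python sketch of that idea (our own simplification for illustration, not the authors' exact bitstream layout):

```python
def bise_pack(values):
    """Pack a block of integers at a shared fixed width
    w = bit_length(max - min), so each entry occupies bits
    [i*w, (i+1)*w) and can be decoded independently in O(1)."""
    lo, hi = min(values), max(values)
    w = max((hi - lo).bit_length(), 1)  # bits needed for the bounded range
    bits = 0
    for i, v in enumerate(values):
        bits |= (v - lo) << (i * w)
    return bits, w, lo

def bise_get(packed, w, lo, i):
    """Random-access decode of the i-th value: shift, mask, re-bias."""
    return ((packed >> (i * w)) & ((1 << w) - 1)) + lo
```

Because decoding is a shift and a mask, the same layout maps naturally onto fast hardware or GPU decoding, which is what makes the scheme suitable for interactive rendering.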
Evaluation of learning-based techniques in novel view synthesis
Abstract. Novel view synthesis is a long-standing topic at the intersection of computer vision and computer graphics, where the fundamental goal is to synthesize an image from a novel viewpoint given a sparse set of reference images. The rapid development of deep learning has introduced a wide range of new ideas and methods in novel view synthesis, where parts of the synthesis process are treated as a supervised learning problem. In particular, neural scene representations paired with volume rendering have achieved state-of-the-art results in novel view synthesis, but the field remains nascent and lacks a comprehensive literature.
This thesis presents an overview of learning-based view synthesis, experiments with state-of-the-art view synthesis methods, evaluates them quantitatively and qualitatively, and finally discusses their properties. Furthermore, we introduce a novel multi-view stereo dataset captured with a hand-held camera and demonstrate the process of collecting and preparing multi-view stereo datasets for view synthesis.
The findings in this thesis indicate that learning-based view synthesis methods excel at synthesizing plausible views of challenging scenes, including situations with complex geometry as well as transparent and reflective materials. Furthermore, we found that it is possible to render such scenes in real-time and to greatly reduce the time needed to prepare a scene for view synthesis by using a pre-trained network that aggregates information from nearby views.
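The neural scene representations paired with volume rendering mentioned in the abstract share a common compositing step: densities and colours sampled along a ray are alpha-composited into a pixel. The following NumPy snippet is a generic sketch of that quadrature of the volume rendering integral, not any particular method's implementation:

```python
import numpy as np

def volume_render(sigma, rgb, deltas):
    """Alpha-composite per-sample densities and colours along a ray.
    sigma: (N,) volume densities at the samples,
    rgb:   (N, 3) colours at the samples,
    deltas:(N,) distances between consecutive samples.
    Returns the composited colour and the per-sample weights."""
    alpha = 1.0 - np.exp(-sigma * deltas)            # opacity of each segment
    # Transmittance: fraction of light surviving all earlier segments.
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))
    weights = trans * alpha
    color = (weights[:, None] * rgb).sum(axis=0)
    return color, weights
```

Training such a representation amounts to regressing `sigma` and `rgb` with a network so that the composited colours match the captured reference images.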
Efficient image-based rendering
Recent advancements in real-time ray tracing and deep learning have significantly enhanced the realism of computer-generated images. However, conventional 3D computer graphics (CG) can still be time-consuming and resource-intensive, particularly when creating photo-realistic simulations of complex or animated scenes. Image-based rendering (IBR) has emerged as an alternative approach that utilizes pre-captured images from the real world to generate realistic images in real-time, eliminating the need for extensive modeling. Although IBR has its advantages, it faces challenges in providing the same level of control over scene attributes as traditional CG pipelines and in accurately reproducing complex scenes and objects with different materials, such as transparent objects. This thesis endeavors to address these issues by harnessing the power of deep learning and incorporating the fundamental principles of graphics and physically-based rendering. It offers an efficient solution that enables interactive manipulation of real-world dynamic scenes captured from sparse views, lighting positions, and times, as well as a physically-based approach that facilitates accurate reproduction of the view-dependency effects resulting from the interaction between transparent objects and their surrounding environment. Additionally, this thesis develops a visibility metric that can identify artifacts in the reconstructed IBR images without observing the reference image, thereby contributing to the design of an effective IBR acquisition pipeline. Lastly, a perception-driven rendering technique is developed to provide high-fidelity visual content in virtual reality displays while retaining computational efficiency.