681 research outputs found

    Learning to Predict Image-based Rendering Artifacts with Respect to a Hidden Reference Image

    Full text link
    Image metrics predict the perceived per-pixel difference between a reference image and its degraded (e.g., re-rendered) version. In several important applications the reference image is not available, and image metrics cannot be applied. We devise a neural network architecture and training procedure that predicts the MSE, SSIM, or VGG16 image difference from the distorted image alone, without observing the reference. This is enabled by two insights. The first is to inject sufficiently many undistorted natural image patches, which are available in arbitrary amounts and are known to have no perceptible difference to themselves; this avoids false positives. The second is to balance the learning so that all image error magnitudes are equally likely, avoiding false negatives. Surprisingly, we observe that the resulting no-reference metric can, subjectively, even perform better than the reference-based one, as it had to become robust against misalignments. We evaluate the effectiveness of our approach in an image-based rendering context, both quantitatively and qualitatively. Finally, we demonstrate two applications which reduce light field capture time and provide guidance for interactive depth adjustment. Comment: 13 pages, 11 figures
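    A minimal sketch of the training idea described above, assuming PyTorch, a toy convolutional predictor, and a hypothetical data source yielding (distorted patch, true error) pairs; the authors' actual architecture and training procedure are not reproduced here.

```python
# Minimal sketch, not the paper's architecture: a small CNN regresses the
# error (e.g. MSE w.r.t. the hidden reference) from the distorted patch alone.
import torch
import torch.nn as nn
import torch.nn.functional as F

class NoRefErrorPredictor(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.head = nn.Linear(64, 1)

    def forward(self, x):
        return self.head(self.features(x).flatten(1)).squeeze(1)

model = NoRefErrorPredictor()
opt = torch.optim.Adam(model.parameters(), lr=1e-4)

def train_step(distorted, target_error, natural):
    """distorted: (B,3,H,W) patches with known error w.r.t. a reference;
    natural: (N,3,H,W) undistorted patches whose target error is zero."""
    # First insight: inject undistorted natural patches with target error 0
    # to suppress false positives.
    x = torch.cat([distorted, natural], dim=0)
    y = torch.cat([target_error, torch.zeros(natural.shape[0])], dim=0)
    # Second insight (sampling so that all error magnitudes are equally
    # likely) would be handled by the data loader, not shown here.
    loss = F.mse_loss(model(x), y)
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()
```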

    RLFC: Random Access Light Field Compression using Key Views and Bounded Integer Encoding

    Full text link
    We present a new hierarchical compression scheme for encoding light field images (LFI) that is suitable for interactive rendering. Our method (RLFC) exploits redundancies in the light field images by constructing a tree structure. The top level (root) of the tree captures the common high-level details across the LFI, and the other levels (children) capture specific low-level details of the LFI. Our decompression algorithm corresponds to tree traversal operations and gathers the values stored at different levels of the tree. Furthermore, we compress the blocks of the tree's children with bounded integer sequence encoding, which provides random access and fast hardware decoding. We have evaluated our method on 4D two-plane parameterized light fields. The compression rates vary from 0.08 to 2.5 bits per pixel (bpp), resulting in compression ratios of around 200:1 to 20:1 for a PSNR quality of 40 to 50 dB. The decompression times for decoding the blocks of the LFI are 1 to 3 microseconds per channel on an NVIDIA GTX-960, and we can render new views with a resolution of 512×512 at 200 fps. Our overall scheme is simple to implement and involves only bit manipulations and integer arithmetic operations. Comment: Accepted for publication at the Symposium on Interactive 3D Graphics and Games (I3D '19)
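    An illustrative sketch of the decoding side, assuming a simplified two-level tree (a shared root block plus per-view child residuals stored as fixed-width integers); the function names, layout, and offset handling here are hypothetical and do not reflect the actual RLFC bitstream.

```python
# Illustrative only: fixed-width (bounded) integers give random access,
# since value i always starts at bit i * bits.
import numpy as np

def decode_bounded_ints(packed: np.ndarray, bits: int, count: int) -> np.ndarray:
    """Unpack `count` unsigned integers of width `bits` from a uint8 array."""
    out = np.empty(count, dtype=np.int64)
    for i in range(count):
        value = 0
        for b in range(bits):
            pos = i * bits + b
            bit = int((packed[pos // 8] >> (pos % 8)) & 1)
            value |= bit << b
        out[i] = value
    return out

def decompress_child(root_block: np.ndarray, child_packed: np.ndarray,
                     bits: int, offset: int) -> np.ndarray:
    """Tree traversal for one view: shared root values plus this child's
    residuals (stored unsigned, shifted by `offset` to allow negatives)."""
    residual = decode_bounded_ints(child_packed, bits, root_block.size)
    return root_block + residual.reshape(root_block.shape) - offset
```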

    Evaluation of learning-based techniques in novel view synthesis

    Get PDF
    Abstract. Novel view synthesis is a long-standing topic at the intersection of computer vision and computer graphics, where the fundamental goal is to synthesize an image from a novel viewpoint given a sparse set of reference images. The rapid development of deep learning has introduced a wide range of new ideas and methods in novel view synthesis, where parts of the synthesis process are treated as a supervised learning problem. In particular, neural scene representations paired with volume rendering have achieved state-of-the-art results in novel view synthesis, but the field remains nascent and its literature sparse. This thesis presents an overview of learning-based view synthesis, experiments with state-of-the-art view synthesis methods, evaluates them quantitatively and qualitatively, and discusses their properties. Furthermore, we introduce a novel multi-view stereo dataset captured with a hand-held camera and demonstrate the process of collecting and preparing multi-view stereo datasets for view synthesis. The findings in this thesis indicate that learning-based view synthesis methods excel at synthesizing plausible views of challenging scenes, including situations with complex geometry as well as transparent and reflective materials. Furthermore, we found that it is possible to render such scenes in real time and to greatly reduce the time needed to prepare a scene for view synthesis by using a pre-trained network that aggregates information from nearby views.
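    As a minimal example of the quantitative evaluation mentioned above, the following sketch computes PSNR between a synthesized view and its held-out ground-truth view; the thesis' exact evaluation protocol and metrics are not reproduced here.

```python
# Toy example of per-view quantitative evaluation (PSNR in dB); assumes
# images are float arrays in [0, 1] and is not the thesis' exact protocol.
import numpy as np

def psnr(synthesized: np.ndarray, reference: np.ndarray, max_val: float = 1.0) -> float:
    mse = np.mean((synthesized.astype(np.float64) - reference.astype(np.float64)) ** 2)
    if mse == 0.0:
        return float("inf")
    return 10.0 * np.log10(max_val ** 2 / mse)
```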

    Efficient image-based rendering

    Get PDF
    Recent advancements in real-time ray tracing and deep learning have significantly enhanced the realism of computer-generated images. However, conventional 3D computer graphics (CG) can still be time-consuming and resource-intensive, particularly when creating photo-realistic simulations of complex or animated scenes. Image-based rendering (IBR) has emerged as an alternative approach that utilizes pre-captured images from the real world to generate realistic images in real time, eliminating the need for extensive modeling. Although IBR has its advantages, it faces challenges in providing the same level of control over scene attributes as traditional CG pipelines and in accurately reproducing complex scenes and objects with different materials, such as transparent objects. This thesis endeavors to address these issues by harnessing the power of deep learning and incorporating the fundamental principles of graphics and physically based rendering. It offers an efficient solution that enables interactive manipulation of real-world dynamic scenes captured from sparse views, lighting positions, and times, as well as a physically based approach that facilitates accurate reproduction of the view-dependent effects resulting from the interaction between transparent objects and their surrounding environment. Additionally, this thesis develops a visibility metric that can identify artifacts in reconstructed IBR images without observing the reference image, thereby contributing to the design of an effective IBR acquisition pipeline. Lastly, a perception-driven rendering technique is developed to provide high-fidelity visual content on virtual reality displays while retaining computational efficiency.

    Image-Based Rendering Of Real Environments For Virtual Reality

    Get PDF