6,509 research outputs found

    深層学習に基づく画像圧縮と品質評価

    Get PDF
    早大学位記番号:新8427早稲田大

    Efficient image-based rendering

    Get PDF
    Recent advancements in real-time ray tracing and deep learning have significantly enhanced the realism of computer-generated images. However, conventional 3D computer graphics (CG) can still be time-consuming and resource-intensive, particularly when creating photo-realistic simulations of complex or animated scenes. Image-based rendering (IBR) has emerged as an alternative approach that utilizes pre-captured images from the real world to generate realistic images in real-time, eliminating the need for extensive modeling. Although IBR has its advantages, it faces challenges in providing the same level of control over scene attributes as traditional CG pipelines and accurately reproducing complex scenes and objects with different materials, such as transparent objects. This thesis endeavors to address these issues by harnessing the power of deep learning and incorporating the fundamental principles of graphics and physical-based rendering. It offers an efficient solution that enables interactive manipulation of real-world dynamic scenes captured from sparse views, lighting positions, and times, as well as a physically-based approach that facilitates accurate reproduction of the view dependency effect resulting from the interaction between transparent objects and their surrounding environment. Additionally, this thesis develops a visibility metric that can identify artifacts in the reconstructed IBR images without observing the reference image, thereby contributing to the design of an effective IBR acquisition pipeline. Lastly, a perception-driven rendering technique is developed to provide high-fidelity visual content in virtual reality displays while retaining computational efficiency.Jüngste Fortschritte im Bereich Echtzeit-Raytracing und Deep Learning haben den Realismus computergenerierter Bilder erheblich verbessert. Konventionelle 3DComputergrafik (CG) kann jedoch nach wie vor zeit- und ressourcenintensiv sein, insbesondere bei der Erstellung fotorealistischer Simulationen von komplexen oder animierten Szenen. Das bildbasierte Rendering (IBR) hat sich als alternativer Ansatz herauskristallisiert, bei dem vorab aufgenommene Bilder aus der realen Welt verwendet werden, um realistische Bilder in Echtzeit zu erzeugen, so dass keine umfangreiche Modellierung erforderlich ist. Obwohl IBR seine Vorteile hat, ist es eine Herausforderung, das gleiche Maß an Kontrolle über Szenenattribute zu bieten wie traditionelle CG-Pipelines und komplexe Szenen und Objekte mit unterschiedlichen Materialien, wie z.B. transparente Objekte, akkurat wiederzugeben. In dieser Arbeit wird versucht, diese Probleme zu lösen, indem die Möglichkeiten des Deep Learning genutzt und die grundlegenden Prinzipien der Grafik und des physikalisch basierten Renderings einbezogen werden. Sie bietet eine effiziente Lösung, die eine interaktive Manipulation von dynamischen Szenen aus der realen Welt ermöglicht, die aus spärlichen Ansichten, Beleuchtungspositionen und Zeiten erfasst wurden, sowie einen physikalisch basierten Ansatz, der eine genaue Reproduktion des Effekts der Sichtabhängigkeit ermöglicht, der sich aus der Interaktion zwischen transparenten Objekten und ihrer Umgebung ergibt. Darüber hinaus wird in dieser Arbeit eine Sichtbarkeitsmetrik entwickelt, mit der Artefakte in den rekonstruierten IBR-Bildern identifiziert werden können, ohne das Referenzbild zu betrachten, und die somit zur Entwicklung einer effektiven IBR-Erfassungspipeline beiträgt. Schließlich wird ein wahrnehmungsgesteuertes Rendering-Verfahren entwickelt, um visuelle Inhalte in Virtual-Reality-Displays mit hoherWiedergabetreue zu liefern und gleichzeitig die Rechenleistung zu erhalten

    A perceptual model of motion quality for rendering with adaptive refresh-rate and resolution

    Get PDF
    Limited GPU performance budgets and transmission bandwidths mean that real-time rendering often has to compromise on the spatial resolution or temporal resolution (refresh rate). A common practice is to keep either the resolution or the refresh rate constant and dynamically control the other variable. But this strategy is non-optimal when the velocity of displayed content varies. To find the best trade-off between the spatial resolution and refresh rate, we propose a perceptual visual model that predicts the quality of motion given an object velocity and predictability of motion. The model considers two motion artifacts to establish an overall quality score: non-smooth (juddery) motion, and blur. Blur is modeled as a combined effect of eye motion, finite refresh rate and display resolution. To fit the free parameters of the proposed visual model, we measured eye movement for predictable and unpredictable motion, and conducted psychophysical experiments to measure the quality of motion from 50 Hz to 165 Hz. We demonstrate the utility of the model with our on-the-fly motion-adaptive rendering algorithm that adjusts the refresh rate of a G-Sync-capable monitor based on a given rendering budget and observed object motion. Our psychophysical validation experiments demonstrate that the proposed algorithm performs better than constant-refresh-rate solutions, showing that motion-adaptive rendering is an attractive technique for driving variable-refresh-rate displays.</jats:p

    Neural 3D Video Synthesis

    Full text link
    We propose a novel approach for 3D video synthesis that is able to represent multi-view video recordings of a dynamic real-world scene in a compact, yet expressive representation that enables high-quality view synthesis and motion interpolation. Our approach takes the high quality and compactness of static neural radiance fields in a new direction: to a model-free, dynamic setting. At the core of our approach is a novel time-conditioned neural radiance fields that represents scene dynamics using a set of compact latent codes. To exploit the fact that changes between adjacent frames of a video are typically small and locally consistent, we propose two novel strategies for efficient training of our neural network: 1) An efficient hierarchical training scheme, and 2) an importance sampling strategy that selects the next rays for training based on the temporal variation of the input videos. In combination, these two strategies significantly boost the training speed, lead to fast convergence of the training process, and enable high quality results. Our learned representation is highly compact and able to represent a 10 second 30 FPS multi-view video recording by 18 cameras with a model size of just 28MB. We demonstrate that our method can render high-fidelity wide-angle novel views at over 1K resolution, even for highly complex and dynamic scenes. We perform an extensive qualitative and quantitative evaluation that shows that our approach outperforms the current state of the art. We include additional video and information at: https://neural-3d-video.github.io/Comment: Project website: https://neural-3d-video.github.io

    Digital Color Imaging

    Full text link
    This paper surveys current technology and research in the area of digital color imaging. In order to establish the background and lay down terminology, fundamental concepts of color perception and measurement are first presented us-ing vector-space notation and terminology. Present-day color recording and reproduction systems are reviewed along with the common mathematical models used for representing these devices. Algorithms for processing color images for display and communication are surveyed, and a forecast of research trends is attempted. An extensive bibliography is provided

    Foveated Path Tracing with Fast Reconstruction and Efficient Sample Distribution

    Get PDF
    Polunseuranta on tietokonegrafiikan piirtotekniikka, jota on käytetty pääasiassa ei-reaaliaikaisen realistisen piirron tekemiseen. Polunseuranta tukee luonnostaan monia muilla tekniikoilla vaikeasti saavutettavia todellisen valon ilmiöitä kuten heijastuksia ja taittumista. Reaaliaikainen polunseuranta on hankalaa polunseurannan suuren laskentavaatimuksen takia. Siksi nykyiset reaaliaikaiset polunseurantasysteemi tuottavat erittäin kohinaisia kuvia, jotka tyypillisesti suodatetaan jälkikäsittelykohinanpoisto-suodattimilla. Erittäin immersiivisiä käyttäjäkokemuksia voitaisiin luoda polunseurannalla, joka täyttäisi laajennetun todellisuuden vaatimukset suuresta resoluutiosta riittävän matalassa vasteajassa. Yksi mahdollinen ratkaisu näiden vaatimusten täyttämiseen voisi olla katsekeskeinen polunseuranta, jossa piirron resoluutiota vähennetään katseen reunoilla. Tämän johdosta piirron laatu on katseen reunoilla sekä harvaa että kohinaista, mikä asettaa suuren roolin lopullisen kuvan koostavalle suodattimelle. Tässä työssä esitellään ensimmäinen reaaliajassa toimiva regressionsuodatin. Suodatin on suunniteltu kohinaisille kuville, joissa on yksi polunseurantanäyte pikseliä kohden. Nopea suoritus saavutetaan tiileissä käsittelemällä ja nopealla sovituksen toteutuksella. Lisäksi työssä esitellään Visual-Polar koordinaattiavaruus, joka jakaa polunseurantanäytteet siten, että niiden jakauma seuraa silmän herkkyysmallia. Visual-Polar-avaruuden etu muihin tekniikoiden nähden on että se vähentää työmäärää sekä polunseurannassa että suotimessa. Nämä tekniikat esittelevät toimivan prototyypin katsekeskeisestä polunseurannasta, ja saattavat toimia tienraivaajina laajamittaiselle realistisen reaaliaikaisen polunseurannan käyttöönotolle.Photo-realistic offline rendering is currently done with path tracing, because it naturally produces many real-life light effects such as reflections, refractions and caustics. These effects are hard to achieve with other rendering techniques. However, path tracing in real time is complicated due to its high computational demand. Therefore, current real-time path tracing systems can only generate very noisy estimate of the final frame, which is then denoised with a post-processing reconstruction filter. A path tracing-based rendering system capable of filling the high resolution in the low latency requirements of mixed reality devices would generate a very immersive user experience. One possible solution for fulfilling these requirements could be foveated path tracing, wherein the rendering resolution is reduced in the periphery of the human visual system. The key challenge is that the foveated path tracing in the periphery is both sparse and noisy, placing high demands on the reconstruction filter. This thesis proposes the first regression-based reconstruction filter for path tracing that runs in real time. The filter is designed for highly noisy one sample per pixel inputs. The fast execution is accomplished with blockwise processing and fast implementation of the regression. In addition, a novel Visual-Polar coordinate space which distributes the samples according to the contrast sensitivity model of the human visual system is proposed. The specialty of Visual-Polar space is that it reduces both path tracing and reconstruction work because both of them can be done with smaller resolution. These techniques enable a working prototype of a foveated path tracing system and may work as a stepping stone towards wider commercial adoption of photo-realistic real-time path tracing

    Image synthesis based on a model of human vision

    Get PDF
    Modern computer graphics systems are able to construct renderings of such high quality that viewers are deceived into regarding the images as coming from a photographic source. Large amounts of computing resources are expended in this rendering process, using complex mathematical models of lighting and shading. However, psychophysical experiments have revealed that viewers only regard certain informative regions within a presented image. Furthermore, it has been shown that these visually important regions contain low-level visual feature differences that attract the attention of the viewer. This thesis will present a new approach to image synthesis that exploits these experimental findings by modulating the spatial quality of image regions by their visual importance. Efficiency gains are therefore reaped, without sacrificing much of the perceived quality of the image. Two tasks must be undertaken to achieve this goal. Firstly, the design of an appropriate region-based model of visual importance, and secondly, the modification of progressive rendering techniques to effect an importance-based rendering approach. A rule-based fuzzy logic model is presented that computes, using spatial feature differences, the relative visual importance of regions in an image. This model improves upon previous work by incorporating threshold effects induced by global feature difference distributions and by using texture concentration measures. A modified approach to progressive ray-tracing is also presented. This new approach uses the visual importance model to guide the progressive refinement of an image. In addition, this concept of visual importance has been incorporated into supersampling, texture mapping and computer animation techniques. Experimental results are presented, illustrating the efficiency gains reaped from using this method of progressive rendering. This visual importance-based rendering approach is expected to have applications in the entertainment industry, where image fidelity may be sacrificed for efficiency purposes, as long as the overall visual impression of the scene is maintained. Different aspects of the approach should find many other applications in image compression, image retrieval, progressive data transmission and active robotic vision
    corecore