456 research outputs found

    Deep-learning the Latent Space of Light Transport

    We suggest a method to directly deep-learn light transport, i.e., the mapping from a 3D geometry-illumination-material configuration to a shaded 2D image. While many previous learning methods have employed 2D convolutional neural networks applied to images, we show for the first time that light transport can be learned directly in 3D. The benefit of 3D over 2D is that the former can also correctly capture illumination effects related to occluded and/or semi-transparent geometry. To learn 3D light transport, we represent the 3D scene as an unstructured 3D point cloud, which is later, during rendering, projected to the 2D output image. Thus, we suggest a two-stage operator comprising a 3D network that first transforms the point cloud into a latent representation, which is then projected to the 2D output image by a dedicated 3D-2D network in a second step. We show that our approach results in improved quality in terms of temporal coherence while retaining most of the computational efficiency of common 2D methods. As a consequence, the proposed two-stage operator serves as a valuable extension to modern deferred shading approaches.
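
    To make the two-stage operator concrete, here is a minimal sketch in PyTorch (an illustration, not the authors' implementation): a per-point network turns point attributes into latent features, the latents are splatted into a 2D feature image using precomputed pixel coordinates from the camera projection, and a small CNN decodes that image to shaded RGB. Layer sizes, the input attributes, and the nearest-pixel splatting are all assumptions.

        # Minimal sketch of a two-stage point-cloud-to-image operator (assumed sizes).
        import torch
        import torch.nn as nn

        class LatentLightTransport(nn.Module):
            def __init__(self, in_dim=9, latent_dim=32, img_size=128):
                super().__init__()
                self.img_size = img_size
                # Stage 1: per-point 3D network (e.g. position, normal, albedo -> latent).
                self.point_net = nn.Sequential(
                    nn.Linear(in_dim, 64), nn.ReLU(),
                    nn.Linear(64, latent_dim), nn.ReLU(),
                )
                # Stage 2: 3D-to-2D network decoding the splatted latent image to RGB.
                self.image_net = nn.Sequential(
                    nn.Conv2d(latent_dim, 64, 3, padding=1), nn.ReLU(),
                    nn.Conv2d(64, 3, 3, padding=1),
                )

            def forward(self, points, pixel_uv):
                # points:   (N, in_dim) attributes of the unstructured point cloud
                # pixel_uv: (N, 2) integer pixel coordinates from the camera projection
                latent = self.point_net(points)                        # (N, latent_dim)
                canvas = latent.new_zeros(latent.shape[1], self.img_size, self.img_size)
                idx = pixel_uv[:, 1] * self.img_size + pixel_uv[:, 0]  # flat pixel indices
                canvas.view(latent.shape[1], -1).index_add_(1, idx, latent.t())
                return self.image_net(canvas.unsqueeze(0))             # (1, 3, H, W)

        # Example with random data: 1000 points rendered to a 128x128 image.
        pts = torch.rand(1000, 9)
        uv = torch.randint(0, 128, (1000, 2))
        img = LatentLightTransport()(pts, uv)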

    A Light Source Calibration Technique for Multi-camera Inspection Devices

    Industrial manufacturing processes often involve a visual control system to detect possible product defects during production. Such inspection devices usually include one or more cameras and several light sources designed to highlight surface imperfections (e.g. bumps, scratches, holes) under different illumination conditions. In such scenarios, a preliminary calibration procedure of each component is a mandatory step to recover the system’s geometrical configuration and thus ensure good process accuracy. In this paper we propose a procedure to estimate the position of each light source with respect to a camera network using an inexpensive Lambertian spherical target. For each light source, the target is acquired at different positions from different cameras, and an initial guess of the corresponding light vector is recovered from the analysis of the collected intensity isocurves. Then, an energy minimization process based on the Lambertian shading model refines the result for a precise 3D localization. We tested our approach in an industrial setup, performing extensive experiments on synthetic and real-world data to demonstrate the accuracy of the proposed approach.
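
    The refinement step can be written compactly. The sketch below (an illustration under assumptions about the data layout and a point-light parameterization, not the paper's implementation) refines a 3D light position by least-squares on the Lambertian model I = albedo * max(0, n . l), starting from the initial guess obtained from the isocurves.

        # Minimal sketch: Lambertian least-squares refinement of a point-light position.
        import numpy as np
        from scipy.optimize import least_squares

        def lambertian_residuals(params, points, normals, intensities):
            # params = (lx, ly, lz, albedo): light position plus a global albedo factor.
            light_pos, albedo = params[:3], params[3]
            l = light_pos - points                           # light vectors per sample
            l /= np.linalg.norm(l, axis=1, keepdims=True)    # normalize
            shading = albedo * np.clip(np.sum(normals * l, axis=1), 0.0, None)
            return shading - intensities                     # per-sample photometric error

        def refine_light(points, normals, intensities, init_pos):
            x0 = np.concatenate([init_pos, [1.0]])           # start from the isocurve guess
            res = least_squares(lambertian_residuals, x0,
                                args=(points, normals, intensities))
            return res.x[:3]                                 # refined 3D light position

        # Synthetic check: samples on a small sphere lit by a light at (0.5, 0.2, 2.0).
        rng = np.random.default_rng(0)
        normals = rng.normal(size=(500, 3))
        normals /= np.linalg.norm(normals, axis=1, keepdims=True)
        points = 0.1 * normals
        true_params = np.array([0.5, 0.2, 2.0, 1.0])
        obs = lambertian_residuals(true_params, points, normals, np.zeros(500))
        print(refine_light(points, normals, obs, init_pos=np.array([0.0, 0.0, 1.0])))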

    GMLight: Lighting Estimation via Geometric Distribution Approximation

    Lighting estimation from a single image is an essential yet challenging task in computer vision and computer graphics. Existing works estimate lighting by regressing representative illumination parameters or generating illumination maps directly. However, these methods often suffer from poor accuracy and generalization. This paper presents Geometric Mover's Light (GMLight), a lighting estimation framework that employs a regression network and a generative projector for effective illumination estimation. We parameterize illumination scenes in terms of the geometric light distribution, light intensity, ambient term, and auxiliary depth, and estimate them as a pure regression task. Inspired by the earth mover's distance, we design a novel geometric mover's loss to guide the accurate regression of light distribution parameters. With the estimated lighting parameters, the generative projector synthesizes panoramic illumination maps with realistic appearance and frequency. Extensive experiments show that GMLight achieves accurate illumination estimation and superior fidelity in relighting for 3D object insertion.
    Comment: 12 pages, 11 figures. arXiv admin note: text overlap with arXiv:2012.1111
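
    As an illustration of the kind of loss the abstract refers to (the anchor layout, the entropic-regularization shortcut, and all constants here are assumptions, not the paper's definition), the sketch below compares two light distributions supported on fixed anchor directions, using the angular distance between anchors as the ground cost and Sinkhorn iterations to approximate the earth mover's distance.

        # Minimal sketch of an earth-mover-style loss over spherical anchor directions.
        import torch

        def geometric_movers_loss(pred, target, anchors, eps=0.1, iters=100):
            # pred, target: (K,) nonnegative weights summing to 1 over K anchors
            # anchors:      (K, 3) unit direction vectors
            cost = torch.acos(torch.clamp(anchors @ anchors.t(), -1.0, 1.0))  # angular ground cost
            kernel = torch.exp(-cost / eps)                  # Gibbs kernel for Sinkhorn
            u = torch.ones_like(pred)
            for _ in range(iters):                           # alternating Sinkhorn scalings
                u = pred / (kernel @ (target / (kernel.t() @ u)))
            v = target / (kernel.t() @ u)
            transport = u.unsqueeze(1) * kernel * v.unsqueeze(0)  # approximate transport plan
            return torch.sum(transport * cost)               # approximate earth mover's distance

        # Example: 64 random anchor directions and two random light distributions.
        anchors = torch.nn.functional.normalize(torch.randn(64, 3), dim=1)
        p = torch.softmax(torch.randn(64), dim=0)
        q = torch.softmax(torch.randn(64), dim=0)
        print(geometric_movers_loss(p, q, anchors))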

    HOLOGRAPHICS: Combining Holograms with Interactive Computer Graphics

    Among all imaging techniques that have been invented throughout the last decades, computer graphics is one of the most successful tools today. Many areas in science, entertainment, education, and engineering would be unimaginable without the aid of 2D or 3D computer graphics. The reason for this success story might be its interactivity, an important property that is still not provided efficiently by competing technologies such as holography. While optical holography and digital holography are limited to presenting non-interactive content, electroholography or computer-generated holograms (CGH) facilitate the computer-based generation and display of holograms at interactive rates [2,3,29,30]. Holographic fringes can be computed either by rendering multiple perspective images and combining them into a stereogram [4], or by simulating the optical interference and calculating the interference pattern [5]. Once computed, such a system dynamically visualizes the fringes with a holographic display. Since creating an electrohologram requires processing, transmitting, and storing a massive amount of data, today’s computer technology still sets the limits for electroholography. To overcome some of these performance issues, advanced reduction and compression methods have been developed that create truly interactive electroholograms. Unfortunately, most of these holograms are relatively small, low resolution, and cover only a small color spectrum. However, recent advances in consumer graphics hardware may reveal potential acceleration possibilities that can overcome these limitations [6].
    In parallel to the development of computer graphics, and despite their non-interactivity, optical and digital holography have created new fields, including interferometry, copy protection, data storage, holographic optical elements, and display holograms. Display holography in particular has conquered several application domains. Museum exhibits often use optical holograms because they can present 3D objects with almost no loss in visual quality. In contrast to most stereoscopic or autostereoscopic graphics displays, holographic images can provide all depth cues—perspective, binocular disparity, motion parallax, convergence, and accommodation—and theoretically can be viewed simultaneously from an unlimited number of positions. Displaying artifacts virtually removes the need to build physical replicas of the original objects. In addition, optical holograms can be used to make engineering, medical, dental, archaeological, and other recordings—for teaching, training, experimentation and documentation. Archaeologists, for example, use optical holograms to archive and investigate ancient artifacts [7,8]. Scientists can use hologram copies to perform their research without having access to the original artifacts or settling for inaccurate replicas. Optical holograms can store a massive amount of information on a thin holographic emulsion. This technology can record and reconstruct a 3D scene with almost no loss in quality. Natural color holographic silver halide emulsion with grain sizes of 8 nm is today’s state of the art [14].
    Today, computer graphics and raster displays offer megapixel resolution and the interactive rendering of megabytes of data. Optical holograms, however, provide terapixel resolution and are able to present information content in the range of terabytes in real time. Both are dimensions that will not be reached by computer graphics and conventional displays within the next few years, even if Moore’s law continues to hold. Obviously, one has to choose between interactivity and quality when selecting a display technology for a particular application. While some applications require high visual realism and real-time presentation (which cannot be provided by computer graphics), others depend on user interaction (which is not possible with optical and digital holograms). Consequently, holography and computer graphics are being used as tools to solve individual research, engineering, and presentation problems within several domains. Up until today, however, these tools have been applied separately. The intention of the project summarized in this chapter is to combine both technologies to create a powerful tool for science, industry, and education. This has been referred to as HoloGraphics. Several possibilities have been investigated that allow merging computer-generated graphics and holograms [1]. The goal is to combine the advantages of conventional holograms (i.e. extremely high visual quality and realism, support for all depth cues and for multiple observers at no computational cost, space efficiency, etc.) with the advantages of today’s computer graphics capabilities (i.e. interactivity, real-time rendering, simulation and animation, stereoscopic and autostereoscopic presentation, etc.). The results of these investigations are presented in this chapter.
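
    As a concrete illustration of the second fringe-computation route mentioned above (simulating optical interference), the sketch below accumulates, at every hologram pixel, the contribution of a spherical wave from each object point. Wavelength, pixel pitch, resolution, and the implicit plane reference wave are illustrative assumptions; practical CGH pipelines add normalization, occlusion handling, and encoding for the target display.

        # Minimal sketch of a point-source interference (fringe) pattern.
        import numpy as np

        def point_source_hologram(points, amplitudes, n=512, pitch=8e-6, wavelength=532e-9):
            k = 2.0 * np.pi / wavelength                  # wavenumber
            coords = (np.arange(n) - n / 2) * pitch       # sample positions on the hologram plane
            x, y = np.meshgrid(coords, coords)            # (n, n) pixel coordinates at z = 0
            fringe = np.zeros((n, n))
            for (px, py, pz), a in zip(points, amplitudes):
                r = np.sqrt((x - px) ** 2 + (y - py) ** 2 + pz ** 2)  # distance to each pixel
                fringe += (a / r) * np.cos(k * r)         # interference with a plane reference
            return fringe

        # Example: three point sources roughly 0.1 m behind the hologram plane.
        pts = [(0.0, 0.0, 0.1), (1e-3, 0.0, 0.1), (0.0, 1e-3, 0.12)]
        pattern = point_source_hologram(pts, amplitudes=[1.0, 0.8, 0.6])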

    Physically-Based Editing of Indoor Scene Lighting from a Single Image

    We present a method to edit complex indoor lighting from a single image with its predicted depth and light source segmentation masks. This is an extremely challenging problem that requires modeling complex light transport, and disentangling HDR lighting from material and geometry with only a partial LDR observation of the scene. We tackle this problem using two novel components: 1) a holistic scene reconstruction method that estimates scene reflectance and parametric 3D lighting, and 2) a neural rendering framework that re-renders the scene from our predictions. We use physically-based indoor light representations that allow for intuitive editing, and infer both visible and invisible light sources. Our neural rendering framework combines physically-based direct illumination and shadow rendering with deep networks to approximate global illumination. It can capture challenging lighting effects, such as soft shadows, directional lighting, specular materials, and interreflections. Previous single image inverse rendering methods usually entangle scene lighting and geometry and only support applications like object insertion. Instead, by combining parametric 3D lighting estimation with neural scene rendering, we demonstrate the first automatic method to achieve full scene relighting, including light source insertion, removal, and replacement, from a single image. All source code and data will be publicly released.
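
    To give a feel for the hybrid neural rendering idea described above, the sketch below combines a physically rendered direct-lighting image with a network-predicted indirect term. The buffer contents, channel counts, and network shape are assumptions, not the paper's architecture.

        # Minimal sketch: physically rendered direct light + learned indirect illumination.
        import torch
        import torch.nn as nn

        class HybridRelighter(nn.Module):
            def __init__(self, in_channels=12):
                super().__init__()
                # Input: screen-space buffers (albedo, normal, depth, ...) + direct lighting.
                self.indirect_net = nn.Sequential(
                    nn.Conv2d(in_channels, 64, 3, padding=1), nn.ReLU(),
                    nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
                    nn.Conv2d(64, 3, 3, padding=1),
                )

            def forward(self, gbuffer, direct_rgb):
                # gbuffer:    (B, 9, H, W) buffers from the estimated scene (reflectance, geometry)
                # direct_rgb: (B, 3, H, W) physically rendered direct light and shadows
                indirect = self.indirect_net(torch.cat([gbuffer, direct_rgb], dim=1))
                return direct_rgb + torch.relu(indirect)   # relit image = direct + indirect

        # Example forward pass with random 128x128 buffers.
        out = HybridRelighter()(torch.rand(1, 9, 128, 128), torch.rand(1, 3, 128, 128))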

    ProlificDreamer: High-Fidelity and Diverse Text-to-3D Generation with Variational Score Distillation

    Score distillation sampling (SDS) has shown great promise in text-to-3D generation by distilling pretrained large-scale text-to-image diffusion models, but suffers from over-saturation, over-smoothing, and low-diversity problems. In this work, we propose to model the 3D parameter as a random variable instead of a constant as in SDS and present variational score distillation (VSD), a principled particle-based variational framework to explain and address the aforementioned issues in text-to-3D generation. We show that SDS is a special case of VSD and leads to poor samples with both small and large CFG weights. In comparison, VSD works well with various CFG weights as ancestral sampling from diffusion models and simultaneously improves the diversity and sample quality with a common CFG weight (i.e., 7.5). We further present various improvements in the design space for text-to-3D such as distillation time schedule and density initialization, which are orthogonal to the distillation algorithm yet not well explored. Our overall approach, dubbed ProlificDreamer, can generate high rendering resolution (i.e., 512×512) and high-fidelity NeRF with rich structure and complex effects (e.g., smoke and drops). Further, initialized from NeRF, meshes fine-tuned by VSD are meticulously detailed and photo-realistic. Project page and codes: https://ml.cs.tsinghua.edu.cn/prolificdreamer/
    Comment: NeurIPS 2023 (Spotlight)
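
    For reference, the gradients being compared can be written out; the notation below follows the standard SDS/VSD formulation (a timestep-dependent weight w(t) and a noise schedule alpha_t, sigma_t) and is a summary rather than a quotation from the paper. With a differentiable renderer x = g(theta, c) for camera c and a pretrained noise predictor epsilon_phi, SDS updates the 3D parameters theta with

        \nabla_\theta \mathcal{L}_{\mathrm{SDS}} = \mathbb{E}_{t,\epsilon,c}\!\left[\, w(t)\,\bigl(\epsilon_\phi(x_t;\, y, t) - \epsilon\bigr)\, \tfrac{\partial x}{\partial \theta} \,\right], \qquad x_t = \alpha_t\, x + \sigma_t\, \epsilon,

    whereas VSD replaces the sampled noise epsilon by the score of the distribution of current renderings, estimated with a fine-tuned (e.g. LoRA) copy epsilon_psi of the diffusion model conditioned on the camera:

        \nabla_\theta \mathcal{L}_{\mathrm{VSD}} = \mathbb{E}_{t,\epsilon,c}\!\left[\, w(t)\,\bigl(\epsilon_\phi(x_t;\, y, t) - \epsilon_\psi(x_t;\, y, t, c)\bigr)\, \tfrac{\partial x}{\partial \theta} \,\right].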

    Media Space: an analysis of spatial practices in planar pictorial media.

    The thesis analyses the visual space displayed in pictures, film, television and digital interactive media. The argument is developed that depictions are informed by the objectives of the artefact as much as by any simple visual correspondence to the observed world. The simple concept of ‘realism’ is therefore anatomised and a more pragmatic theory proposed which resolves some of the traditional controversies concerning the relation between depiction and vision. This is then applied to the special problems of digital interactive media. An introductory chapter outlines the topic area and the main argument and provides an initial definition of terms. To provide a foundation for the ensuing arguments, a brief account is given of two existing and contrasted approaches to the notion of space: that of perception science which gives priority to acultural aspects, and that of visual culture which emphasises aspects which are culturally contingent. An existing approach to spatial perception (that of JJ Gibson originating in the 1940s and 50s) is applied to spatial depiction in order to explore the differences between seeing and picturing, and also to emphasise the many different cues for spatial perception beyond those commonly considered (such as binocularity and linear perspective). At this stage a simple framework of depiction is introduced which identifies five components or phases: the objectives of the picture, the idea chosen to embody the objectives, the model (essentially, the visual ‘subject matter’), the characteristics of the view and finally the substantive picture or depiction itself. This framework draws attention to the way in which each of the five phases presents an opportunity for decision-making about representation. The framework is used and refined throughout the thesis. Since pictures are considered in some everyday sense to be ‘realistic’ (otherwise, in terms of this thesis, they would not count as depictions), the nature of realism is considered at some length. The apparently unitary concept is broken down into several different types of realism and it is argued that, like the different spatial cues, each lends itself to particular objectives intended for the artefact. From these several types, two approaches to realism are identified, one prioritising the creation of a true illusion (that the picture is in fact a scene) and the other (of which there are innumerably more examples both across cultures and over historical time) one which evokes aspects of vision without aiming to exactly imitate the optical stimulus of the scene. Various reasons for the latter approach, and the variety of spatial practices to which it leads, are discussed. In addition to analysing traditional pictures, computer graphics images are discussed in conjunction with the claims for realism offered by their authors. In the process, informational and affective aspects of picture-making are distinguished, a distinction which it is argued is useful and too seldom made. Discussion of still pictures identifies the evocation of movement (and other aspects of time) as one of the principal motives for departing from attempts at straightforward optical matching. The discussion proceeds to the subject of film where, perhaps surprisingly now that the depiction of movement is possible, the lack of straightforward imitation of the optical is noteworthy again. This is especially true of the relationship between shots rather than within them; the reasons for this are analysed. 
This reinforces the argument that the spatial form of the fiction film, like that of other kinds of depiction, arises from its objectives, presenting realism once again as a contingent concept. The separation of depiction into two broad classes – one which aims to negate its own mediation, to seem transparent to what it depicts, and one which presents the fact of depiction ostensively to the viewer – is carried through from still pictures, via film, into a discussion of factual television and finally of digital interactive media. The example of factual television is chosen to emphasise how, despite the similarities between the technologies of film and television, spatial practices within some television genres contrast strongly with those of the mainstream fiction film. By considering historic examples, it is shown that many of the spatial practices now familiar in factual television were gradually expunged from the classical film when the latter became centred on the concerns of narrative fiction. By situating the spaces of interactive media in the context of other kinds of pictorial space, questions are addressed concerning the transferability of spatial usages from traditional media to those which are interactive. During the thesis the spatial practices of still-picture-making, film and television are characterised as ‘mature’ and ‘expressive’ (terms which are defined in the text). By contrast the spatial practices of digital interactive media are seen to be immature and inexpressive. It is argued that this is to some degree inevitable given the context in which interactive media artefacts are made and experienced – the lack of a shared ‘language’ or languages in any new media. Some of the difficult spatial problems which digital interactive media need to overcome are identified, especially where, as is currently normal, interaction is based on the relation between a pointer and visible objects within a depiction. The range of existing practice in digital interactive media is classified in a seven-part taxonomy, which again makes use of the objective-idea-model-view-picture framework, and again draws out the difference between self-concealing approaches to depiction and those which offer awareness of depiction as a significant component of the experience. The analysis indicates promising lines of enquiry for the future and emphasises the need for further innovation. Finally the main arguments are summarised and the thesis concludes with a short discussion of the implications for design arising from the key concepts identified – expressivity and maturity, pragmatism and realism
