
    Novel View Synthesis from a Single RGBD Image for Indoor Scenes

    In this paper, we propose an approach for synthesizing novel view images from a single RGBD (Red Green Blue-Depth) input. Novel view synthesis (NVS) is an interesting computer vision task with extensive applications. Methods using multiple images have been well studied; exemplary ones include training scene-specific Neural Radiance Fields (NeRF) or leveraging multi-view stereo (MVS) and 3D rendering pipelines. However, these are either computationally intensive or non-generalizable across different scenes, limiting their practical value. Conversely, the depth information embedded in RGBD images unlocks 3D potential from a single view, simplifying NVS. The widespread availability of compact, affordable stereo cameras, and even LiDARs in contemporary devices such as smartphones, makes capturing RGBD images more accessible than ever. In our method, we convert an RGBD image into a point cloud, render it from a different viewpoint, and formulate the NVS task as an image translation problem. We leverage generative adversarial networks to style-transfer the rendered image, achieving a result similar to a photograph taken from the new perspective. We explore both unsupervised learning using CycleGAN and supervised learning with Pix2Pix, and demonstrate the qualitative results. Our method circumvents the limitations of traditional multi-image techniques, holding significant promise for practical, real-time applications in NVS.
    Comment: 2nd International Conference on Image Processing, Computer Vision and Machine Learning, November 202
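The geometric first stage the abstract describes — back-projecting an RGBD image into a point cloud and rendering it from a new viewpoint — can be sketched with a standard pinhole camera model. This is a minimal numpy illustration, not the authors' implementation; the intrinsics `K`, the image size, and the pose `(R, t)` are made-up values for the example.

```python
import numpy as np

def rgbd_to_points(depth, K):
    """Back-project a depth map into a 3D point cloud in the camera frame.

    depth: (H, W) depth in metres; K: 3x3 pinhole intrinsics.
    Returns an (H*W, 3) array of points."""
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3)
    rays = pix @ np.linalg.inv(K).T        # normalised camera rays
    return rays * depth.reshape(-1, 1)     # scale each ray by its depth

def project_points(points, K, R, t):
    """Project 3D points into a novel view with relative pose (R, t)."""
    cam = points @ R.T + t                 # transform into the new camera frame
    pix = cam @ K.T
    return pix[:, :2] / pix[:, 2:3]        # perspective divide -> pixel coords

# Toy check: with an identity pose, every pixel reprojects onto itself.
K = np.array([[500.0, 0, 320], [0, 500.0, 240], [0, 0, 1]])
depth = np.full((480, 640), 2.0)           # flat 2 m scene, for illustration
pts = rgbd_to_points(depth, K)
uv = project_points(pts, K, np.eye(3), np.zeros(3))
```

The rendered novel view produced this way contains holes and disocclusions, which is why the paper then treats hole-filling as an image-to-image translation problem for a GAN.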

    Unsupervised Learning of Depth and Ego-Motion from Video

    We present an unsupervised learning framework for the task of monocular depth and camera motion estimation from unstructured video sequences. We achieve this by simultaneously training depth and camera pose estimation networks using the task of view synthesis as the supervisory signal. The networks are thus coupled via the view synthesis objective during training, but can be applied independently at test time. Empirical evaluation on the KITTI dataset demonstrates the effectiveness of our approach: 1) monocular depth performing comparably with supervised methods that use either ground-truth pose or depth for training, and 2) pose estimation performing favorably with established SLAM systems under comparable input settings.
    Comment: Accepted to CVPR 2017. Project webpage: https://people.eecs.berkeley.edu/~tinghuiz/projects/SfMLearner
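The "view synthesis as supervisory signal" idea above can be sketched as an inverse warp followed by a photometric loss: predicted depth and pose are used to warp a source frame into the target view, and the pixel difference trains both networks. This is a minimal numpy sketch of that loss (nearest-neighbour sampling, toy intrinsics), not the paper's differentiable bilinear-sampling implementation.

```python
import numpy as np

def inverse_warp(src, depth, K, R, t):
    """Warp a source image into the target view via target depth and relative
    pose (R, t), using nearest-neighbour sampling for simplicity.

    src: (H, W) source image; depth: (H, W) target-view depth; K: 3x3 intrinsics."""
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    pix = np.stack([u, v, np.ones_like(u)], -1).reshape(-1, 3).astype(float)
    pts = (pix @ np.linalg.inv(K).T) * depth.reshape(-1, 1)  # back-project
    cam = pts @ R.T + t                                      # move to source frame
    proj = cam @ K.T
    uv = proj[:, :2] / proj[:, 2:3]                          # source pixel coords
    ui = np.clip(np.round(uv[:, 0]).astype(int), 0, W - 1)
    vi = np.clip(np.round(uv[:, 1]).astype(int), 0, H - 1)
    return src[vi, ui].reshape(H, W)

def photometric_loss(target, warped):
    """Mean L1 photometric error -- the self-supervised training signal."""
    return np.abs(target - warped).mean()

# With identity motion and correct depth, the warp is exact and the loss is zero.
K = np.array([[100.0, 0, 16], [0, 100.0, 12], [0, 0, 1]])
img = np.random.rand(24, 32)
depth = np.full((24, 32), 3.0)
warped = inverse_warp(img, depth, K, np.eye(3), np.zeros(3))
loss = photometric_loss(img, warped)
```

In training, gradients of this loss flow through both the depth map and the pose, which is what couples the two networks without ground-truth labels.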

    Evaluation of learning-based techniques in novel view synthesis

    Abstract. Novel view synthesis is a long-standing topic at the intersection of computer vision and computer graphics, where the fundamental goal is to synthesize an image from a novel viewpoint given a sparse set of reference images. The rapid development of deep learning has introduced a wide range of new ideas and methods to novel view synthesis, in which parts of the synthesis process are treated as a supervised learning problem. Specifically, neural scene representations paired with volume rendering have achieved state-of-the-art results in novel view synthesis, but the field remains nascent and faces a lack of literature. This thesis presents an overview of learning-based view synthesis, experiments with state-of-the-art view synthesis methods, evaluates them quantitatively and qualitatively, and finally discusses their properties. Furthermore, we introduce a novel multi-view stereo dataset captured with a hand-held camera and demonstrate the process of collecting and preparing multi-view stereo datasets for view synthesis. The findings in this thesis indicate that learning-based view synthesis methods excel at synthesizing plausible views of challenging scenes, including situations with complex geometry as well as transparent and reflective materials. Furthermore, we found that it is possible to render such scenes in real time and to greatly reduce the time needed to prepare a scene for view synthesis by using a pre-trained network that aggregates information from nearby views.
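The "neural scene representations paired with volume rendering" that the abstract credits with state-of-the-art results composite colour along each camera ray with the standard quadrature C = Σᵢ Tᵢ (1 − exp(−σᵢ δᵢ)) cᵢ. A minimal numpy sketch of that compositing step, with made-up sample values:

```python
import numpy as np

def volume_render(sigmas, colors, deltas):
    """Composite colours along one ray with the NeRF-style volume-rendering
    quadrature: C = sum_i T_i * (1 - exp(-sigma_i * delta_i)) * c_i.

    sigmas: (N,) densities; colors: (N, 3) RGB samples; deltas: (N,) segment lengths."""
    alphas = 1.0 - np.exp(-sigmas * deltas)                # per-sample opacity
    # Transmittance T_i: probability the ray reaches sample i unoccluded.
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alphas[:-1]]))
    weights = trans * alphas
    return (weights[:, None] * colors).sum(axis=0)

# A fully opaque first sample occludes everything behind it,
# so the rendered colour is that sample's colour.
sigmas = np.array([1e6, 1.0])
colors = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
deltas = np.array([0.1, 0.1])
c = volume_render(sigmas, colors, deltas)   # ≈ [1, 0, 0]
```

In a full method, `sigmas` and `colors` come from a network queried at sampled points along the ray, and the same weights are reused for depth and opacity estimates.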

    Direct virtual viewpoint synthesis from multiple viewpoints
