5,621 research outputs found

    Livrable D2.2 of the PERSEE project : Analyse/Synthese de Texture

    Get PDF
    Livrable D2.2 du projet ANR PERSEECe rapport a été réalisé dans le cadre du projet ANR PERSEE (n° ANR-09-BLAN-0170). Exactement il correspond au livrable D2.2 du projet. Son titre : Analyse/Synthese de Textur

    RealFlow: EM-based Realistic Optical Flow Dataset Generation from Videos

    Full text link
    Obtaining the ground truth labels from a video is challenging since the manual annotation of pixel-wise flow labels is prohibitively expensive and laborious. Besides, existing approaches try to adapt the trained model on synthetic datasets to authentic videos, which inevitably suffers from domain discrepancy and hinders the performance for real-world applications. To solve these problems, we propose RealFlow, an Expectation-Maximization based framework that can create large-scale optical flow datasets directly from any unlabeled realistic videos. Specifically, we first estimate optical flow between a pair of video frames, and then synthesize a new image from this pair based on the predicted flow. Thus the new image pairs and their corresponding flows can be regarded as a new training set. Besides, we design a Realistic Image Pair Rendering (RIPR) module that adopts softmax splatting and bi-directional hole filling techniques to alleviate the artifacts of the image synthesis. In the E-step, RIPR renders new images to create a large quantity of training data. In the M-step, we utilize the generated training data to train an optical flow network, which can be used to estimate optical flows in the next E-step. During the iterative learning steps, the capability of the flow network is gradually improved, so is the accuracy of the flow, as well as the quality of the synthesized dataset. Experimental results show that RealFlow outperforms previous dataset generation methods by a considerably large margin. Moreover, based on the generated dataset, our approach achieves state-of-the-art performance on two standard benchmarks compared with both supervised and unsupervised optical flow methods. Our code and dataset are available at https://github.com/megvii-research/RealFlowComment: ECCV 2022 Ora

    Rate-Distortion Analysis of Multiview Coding in a DIBR Framework

    Get PDF
    Depth image based rendering techniques for multiview applications have been recently introduced for efficient view generation at arbitrary camera positions. Encoding rate control has thus to consider both texture and depth data. Due to different structures of depth and texture images and their different roles on the rendered views, distributing the available bit budget between them however requires a careful analysis. Information loss due to texture coding affects the value of pixels in synthesized views while errors in depth information lead to shift in objects or unexpected patterns at their boundaries. In this paper, we address the problem of efficient bit allocation between textures and depth data of multiview video sequences. We adopt a rate-distortion framework based on a simplified model of depth and texture images. Our model preserves the main features of depth and texture images. Unlike most recent solutions, our method permits to avoid rendering at encoding time for distortion estimation so that the encoding complexity is not augmented. In addition to this, our model is independent of the underlying inpainting method that is used at decoder. Experiments confirm our theoretical results and the efficiency of our rate allocation strategy

    Direction Hole-Filling Method for a 3D View Generator

    Get PDF
    [[abstract]]Depth image-based rendering (DIBR) technology is an approach to creating a virtual 3D image from one single 2D image. A desired view can be synthesised at the receiver side using depth images to make transmission and storage efficient. While this technique has many advantages, one of the key challenges is how to fill the holes caused by disocclusion regions and wrong depth values in the warped left/right images. A common means to alleviate the sizes and the number of holes is to smooth the depth image. But smoothing results in geometric distortions and degrades the depth image quality. This study proposes a hole-filling method based on the oriented texture direction. Parallax correction is first implemented to mitigate the wrong depth values. Texture directional information is then probed in the background pixels where holes take place after warping. Next, in the warped image, holes are filled according to their directions. Experimental results showed that this algorithm preserves the depth information and greatly reduces the amount of geometric distortion.[[notice]]補正完

    Livrable D5.2 of the PERSEE project : 2D/3D Codec architecture

    Get PDF
    Livrable D5.2 du projet ANR PERSEECe rapport a été réalisé dans le cadre du projet ANR PERSEE (n° ANR-09-BLAN-0170). Exactement il correspond au livrable D5.2 du projet. Son titre : 2D/3D Codec architectur

    Automatic 3DS Conversion of Historical Aerial Photographs

    Get PDF
    In this paper we present a method for the generation of 3D stereo (3DS) pairs from sequences of historical aerial photographs. The goal of our work is to provide a stereoscopic display when the existing exposures are in a monocular sequence. Each input image is processed using its neighbours and a synthetic image is rendered, which, together with the original one, form a stereo pair. Promising results on real images taken from a historical photo archive are shown, that corroborate the viability of generating 3DS data from monocular footage

    Application for light field inpainting

    Get PDF
    Light Field (LF) imaging is a multimedia technology that can provide more immersive experience when visualizing a multimedia content with higher levels of realism compared to conventional imaging technologies. This technology is mainly promising for Virtual Reality (VR) since it displays real-world scenes in a way that users can experience the captured scenes in every position and every angle, due to its 4-dimensional LF representation. For these reasons, LF is a fast-growing technology, with so many topics to explore, being the LF inpainting the one that was explored in this dissertation. Image inpainting is an editing technique that allows synthesizing alternative content to fill in holes in an image. It is commonly used to fill missing parts in a scene and restore damaged images such that the modifications are correct and visually realistic. Applying traditional 2D inpainting techniques straightforwardly to LFs is very unlikely to result in a consistent inpainting in its all 4 dimensions. Usually, to inpaint a 4D LF content, 2D inpainting algorithms are used to inpaint a particular point of view and then 4D inpainting propagation algorithms propagate the inpainted result for the whole 4D LF data. Based on this idea of 4D inpainting propagation, some 4D LF inpainting techniques have been recently proposed in the literature. Therefore, this dissertation proposes to design and implement an LF inpainting application that can be used by the public that desire to work in this field and/or explore and edit LFs.Campos de luz é uma tecnologia multimédia que fornece uma experiência mais imersiva ao visualizar conteúdo multimédia com níveis mais altos de realismo, comparando a tecnologias convencionais de imagem. Esta tecnologia é promissora, principalmente para Realidade Virtual, pois exibe cenas capturadas do mundo real de forma que utilizadores as possam experimentar em todas as posições e ângulos, devido à sua representação em 4 dimensões. Por isso, esta é tecnologia em rápido crescimento, com tantos tópicos para explorar, sendo o inpainting o explorado nesta dissertação. Inpainting de imagens é uma técnica de edição, permitindo sintetizar conteúdo alternativo para preencher lacunas numa imagem. Comumente usado para preencher partes que faltam numa cena e restaurar imagens danificadas, de forma que as modificações sejam corretas e visualmente realistas. É muito improvável que aplicar técnicas tradicionais de inpainting 2D diretamente a campos de luz resulte num inpainting consistente em todas as suas 4 dimensões. Normalmente, para fazer inpainting num conteúdo 4D de campos de luz, os algoritmos de inpainting 2D são usados para fazer inpainting de um ponto de vista específico e, seguidamente, os algoritmos de propagação de inpainting 4D propagam o resultado do inpainting para todos os dados do campo de luz 4D. Com base nessa ideia de propagação de inpainting 4D, algumas técnicas foram recentemente propostas na literatura. Assim, esta dissertação propõe-se a conceber e implementar uma aplicação de inpainting de campos de luz que possa ser utilizada pelo público que pretenda trabalhar nesta área e/ou explorar e editar campos de luz

    3D Hierarchical Refinement and Augmentation for Unsupervised Learning of Depth and Pose from Monocular Video

    Full text link
    Depth and ego-motion estimations are essential for the localization and navigation of autonomous robots and autonomous driving. Recent studies make it possible to learn the per-pixel depth and ego-motion from the unlabeled monocular video. A novel unsupervised training framework is proposed with 3D hierarchical refinement and augmentation using explicit 3D geometry. In this framework, the depth and pose estimations are hierarchically and mutually coupled to refine the estimated pose layer by layer. The intermediate view image is proposed and synthesized by warping the pixels in an image with the estimated depth and coarse pose. Then, the residual pose transformation can be estimated from the new view image and the image of the adjacent frame to refine the coarse pose. The iterative refinement is implemented in a differentiable manner in this paper, making the whole framework optimized uniformly. Meanwhile, a new image augmentation method is proposed for the pose estimation by synthesizing a new view image, which creatively augments the pose in 3D space but gets a new augmented 2D image. The experiments on KITTI demonstrate that our depth estimation achieves state-of-the-art performance and even surpasses recent approaches that utilize other auxiliary tasks. Our visual odometry outperforms all recent unsupervised monocular learning-based methods and achieves competitive performance to the geometry-based method, ORB-SLAM2 with back-end optimization.Comment: 10 pages, 7 figures, under revie

    Dynamic shape capture using multi-view photometric stereo

    Full text link
    • …
    corecore