9 research outputs found

    Direct virtual viewpoint synthesis from multiple viewpoints

    Dynamic programming for multi-view disparity/depth estimation

    In-Band Disparity Compensation for Multiview Image Compression and View Synthesis

    Single-lens multi-ocular stereovision using prism

    Deep Multicameral Decoding for Localizing Unoccluded Object Instances from a Single RGB Image

    Occlusion-aware instance-sensitive segmentation is a complex task, generally split into region-based segmentations by approximating instances as their bounding boxes. We address the showcase scenario of dense homogeneous layouts, in which this approximation does not hold. In this scenario, outlining unoccluded instances by decoding a deep encoder becomes difficult, due to the translation invariance of convolutional layers and the lack of complexity in the decoder. We therefore propose a multicameral design composed of subtask-specific lightweight decoder and encoder-decoder units, coupled in cascade to encourage subtask-specific feature reuse and enforce a learning path within the decoding process. Furthermore, the state-of-the-art datasets for occlusion-aware instance segmentation contain real images with few instances and occlusions mostly due to objects occluding the background, unlike dense object layouts. We thus also introduce a synthetic dataset of dense homogeneous object layouts, namely Mikado, which extensibly contains more instances and inter-instance occlusions per image than these public datasets. Our extensive experiments on Mikado and public datasets show that ordinal multiscale units within the decoding process prove more effective than state-of-the-art design patterns for capturing position-sensitive representations. We also show that Mikado is plausible with respect to real-world problems, in the sense that it enables the learning of performance-enhancing representations transferable to real images, while drastically reducing the need for hand-made annotations for finetuning. The proposed dataset will be made publicly available. Comment: International Journal of Computer Vision, Springer Verlag, 2020, Special Issue on Deep Learning for Robotic Vision.
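    To make the cascaded decoding idea concrete, here is a minimal PyTorch sketch of two subtask-specific decoder units coupled in cascade, where the second unit consumes the first unit's prediction alongside the shared encoder features. The module names, subtasks, and shapes are illustrative guesses for the sake of the example, not the authors' released architecture.

```python
# Hypothetical sketch of a cascade of subtask-specific decoding units,
# assuming a shared encoder and two subtasks (e.g. boundaries, then
# unoccluded instance masks); names and shapes are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LightweightDecoder(nn.Module):
    """A small decoder unit: upsample encoder features to a subtask map."""
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False),
            nn.Conv2d(64, out_ch, 3, padding=1),
        )

    def forward(self, x):
        return self.block(x)

class CascadedMulticameral(nn.Module):
    """Couple subtask units in cascade: the second unit reuses the first
    unit's output, enforcing an ordered learning path while decoding."""
    def __init__(self, enc_ch: int = 128):
        super().__init__()
        # Toy encoder standing in for any backbone trunk.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, enc_ch, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        self.boundary_head = LightweightDecoder(enc_ch, 1)      # subtask 1
        # Subtask 2 sees encoder features plus subtask 1's prediction.
        self.instance_head = LightweightDecoder(enc_ch + 1, 1)  # subtask 2

    def forward(self, img):
        feats = self.encoder(img)
        boundaries = self.boundary_head(feats)
        # Downsample subtask 1's output to feature resolution for reuse.
        b_small = F.interpolate(boundaries, size=feats.shape[-2:],
                                mode="bilinear", align_corners=False)
        masks = self.instance_head(torch.cat([feats, b_small], dim=1))
        return boundaries, masks

model = CascadedMulticameral()
boundaries, masks = model(torch.randn(1, 3, 128, 128))
print(boundaries.shape, masks.shape)  # both torch.Size([1, 1, 128, 128])
```

    The point of the cascade is that the second head cannot ignore the first subtask: its input explicitly contains the earlier prediction, which encourages feature reuse across subtasks.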

    Disparity Refinement based on Depth Image Layers Separation for Stereo Matching Algorithms

    Get PDF
    This paper presents a method to improve raw disparity maps in the disparity refinement stage of a stereo matching algorithm. The proposed algorithm uses the disparity depth map produced by the stereo matching algorithm as the initial disparity output, with the sum of absolute differences (SAD) as the basic similarity metric. The similarity metric matches pixel points between the left and right images under a fixed window (FW) search process. With this approach, the raw disparity depth map is not smooth and contains errors, particularly at depth discontinuities, and it fails in uniform areas and repetitive patterns. The initial disparity map is then used to identify the layers of the disparity depth map by adapting the Depth Image Layers Separation (DILS) algorithm, which separates the layers of depth based on the disparity range. Each disparity depth map is distributed along the disparity range and can be divided into several layers. Each layer is mapped to a segmented reference image to refine the disparity depth map. This method, called Depth Layer Refinement (DLR), uses the disparity depth layers to refine the disparity map.
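    As a rough sketch of the two stages described above, the following Python snippet computes a fixed-window SAD disparity map and then bins it into depth layers over the disparity range, in the spirit of DILS. It is a toy illustration under simplified assumptions (rectified grayscale input, equal-width layer bins), not the paper's implementation, and the function names are invented for the example.

```python
# Toy fixed-window SAD disparity followed by DILS-style layer separation.
# Assumes rectified grayscale images as 2D NumPy arrays.
import numpy as np
from scipy.ndimage import uniform_filter

def sad_disparity(left, right, max_disp=32, win=5):
    """Fixed-window (FW) matching: for each pixel, pick the disparity
    whose window has the lowest sum of absolute differences (SAD)."""
    h, w = left.shape
    best_cost = np.full((h, w), np.inf, dtype=np.float32)
    disp = np.zeros((h, w), dtype=np.int32)
    for d in range(max_disp):
        shifted = np.empty((h, w), dtype=np.float32)
        shifted[:, d:] = right[:, : w - d]   # shift right image by d
        shifted[:, :d] = right[:, :1]        # replicate the left border
        # Window mean of absolute differences; same argmin as the sum.
        sad = uniform_filter(np.abs(left.astype(np.float32) - shifted),
                             size=win)
        better = sad < best_cost
        best_cost[better] = sad[better]
        disp[better] = d
    return disp

def separate_depth_layers(disp, n_layers=4):
    """DILS-style layering: split the disparity range into equal bins so
    each layer can be refined against a segmented reference image."""
    edges = np.linspace(disp.min(), disp.max() + 1, n_layers + 1)
    return np.digitize(disp, edges[1:-1])    # layer index per pixel

left = np.random.randint(0, 255, (64, 64)).astype(np.uint8)
right = np.roll(left, -3, axis=1)            # synthetic 3-pixel shift
layers = separate_depth_layers(sad_disparity(left, right))
print(np.bincount(layers.ravel()))           # pixels per depth layer
```

    Note the design choice of using a box filter over per-pixel absolute differences: averaging over the window ranks candidate disparities exactly as the windowed sum would, while staying vectorized.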

    Depth scene estimation from images captured with a plenoptic camera

    Undergraduate monograph, Université de Bordeaux, ENSEIRB-MATMECA, Universidade de Brasília, 2013. A plenoptic camera, also known as a light field camera, is a device that employs a microlens array placed between the main lens and the camera sensor to capture the 4D light field information of a scene. Such a light field lets us recover the position and angle of incidence of the light rays captured by the camera, and can be used to improve solutions to computer graphics and computer vision problems. With a sampled light field acquired from a plenoptic camera, several low-resolution views of the scene are available from which to infer depth. Unlike traditional multiview stereo, these views are captured by the same sensor, implying that they are acquired with the same camera parameters; the views are also in perfect epipolar geometry. However, other problems arise with this configuration. The camera sensor uses a Bayer color filter, and demosaicing the RAW data introduces cross-talk between views, creating image artifacts. Rendering the views modifies the color pattern, adding complexity to the demosaicing. The resolution of the views is another problem: since the angular and spatial information is sampled by the same sensor, there is a trade-off between view resolution and the number of available views. For the Lytro camera, for example, the views are rendered at about 0.12 megapixels, implying aliasing for most scenes.
    This work presents: an approach to render the views from the RAW image captured by the camera; a method of disparity estimation adapted to plenoptic cameras that enables estimation even without demosaicing; a new concept for representing disparity information in the multiview stereo case; and a reconstruction and demosaicing scheme using the disparity information and the pixels of neighbouring views.
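    As a loose illustration of how sub-aperture views are assembled from the raw lenslet image, and of the spatial/angular trade-off noted above, here is a minimal Python sketch assuming an idealized plenoptic camera whose microlenses fall exactly on an n x n pixel grid. Real devices such as the Lytro require grid calibration and must handle the Bayer pattern, which this toy example ignores; the function name render_views is hypothetical.

```python
# Idealized sub-aperture view extraction from a lenslet image.
# Assumes microlenses aligned to an exact n x n pixel grid, no Bayer
# pattern and no grid calibration (unlike real Lytro RAW data).
import numpy as np

def render_views(raw, n):
    """Split a (H*n, W*n) lenslet image into n*n views of size (H, W).
    Pixel (u, v) under every microlens is gathered into view (u, v)."""
    H, W = raw.shape[0] // n, raw.shape[1] // n
    lenslets = raw[: H * n, : W * n].reshape(H, n, W, n)
    # views[u, v] is the image seen through sub-aperture (u, v).
    return lenslets.transpose(1, 3, 0, 2)    # shape (n, n, H, W)

raw = np.random.rand(300, 400)               # toy lenslet image, n = 10
views = render_views(raw, n=10)
print(views.shape)                           # (10, 10, 30, 40)
```

    Each of the n*n views receives only 1/n^2 of the sensor's pixels, which is exactly the resolution versus view-count trade-off described in the abstract.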