
    Classic Mosaics and Visual Correspondence via Graph-Cut based Energy Optimization

    Computer graphics and computer vision were traditionally two distinct research fields with largely inverse goals: graphics synthesises images from models, while vision recovers models from images. Lately, they have been increasingly borrowing ideas and tools from each other. In this thesis, we investigate two problems in computer vision and graphics that rely on the same tool, namely energy optimization with graph cuts. In the area of computer graphics, we address the problem of generating artificial classic mosaics, both still and animated. The main purpose of artificial mosaics is to help a user create digital art. First we reformulate our previous static mosaic work in a more principled global optimization framework. Then, relying on our still mosaic algorithm, we develop a method for producing animated mosaics directly from real video sequences, which we believe is the first such method. Our mosaic animation style is uniquely expressive. Our method estimates the motion of the pixels in the video and renders the frames with a mosaic effect based on both the colour and motion information from the input video. This algorithm relies extensively on our novel motion segmentation approach, which is a computer vision problem. To improve the quality of our animated mosaics, we need to improve the motion segmentation algorithm. Since motion and stereo problems have a similar setup, we start with the problem of finding visual correspondence for stereo, which has the advantage of datasets with ground truth, useful for evaluation. Most previous methods for stereo correspondence do not provide any measure of reliability in their estimates. We aim to find the regions for which correspondence can be determined reliably. Our main idea is to find corresponding regions that have a sufficiently strong texture cue on the boundary, since texture is a reliable cue for matching. Unlike previous work, we allow the disparity range within each such region to vary smoothly, instead of being constant. This produces blob-like semi-dense visual features for which we have high confidence in the estimated ranges of disparities.
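    The graph-cut machinery shared by the mosaic and correspondence work can be illustrated on a toy binary labelling problem. The sketch below is hypothetical code, not the thesis implementation: it minimises a data-plus-smoothness energy over a 1D signal by building the standard s-t graph (terminal edges carry the data costs, neighbour edges carry the smoothness penalty) and computing a min cut via Edmonds-Karp max flow.

```python
from collections import defaultdict, deque

def _max_flow(cap, s, t):
    """Edmonds-Karp max flow on a residual graph stored as dict-of-dicts."""
    while True:
        parent = {s: None}
        q = deque([s])
        while q and t not in parent:
            u = q.popleft()
            for v, c in list(cap[u].items()):
                if c > 1e-12 and v not in parent:
                    parent[v] = u
                    q.append(v)
        if t not in parent:          # no augmenting path left: flow is maximal
            return
        # trace the augmenting path and push the bottleneck flow along it
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck

def graph_cut_binary(values, lam):
    """Minimise sum_i (v_i - x_i)^2 + lam * sum_i [x_i != x_{i+1}]
    over binary labels x_i via the standard s-t graph construction."""
    s, t = 's', 't'
    cap = defaultdict(lambda: defaultdict(float))
    for i, v in enumerate(values):
        cap[s][i] += (v - 1.0) ** 2   # paid iff pixel i ends up labelled 1
        cap[i][t] += v ** 2           # paid iff pixel i ends up labelled 0
    for i in range(len(values) - 1):  # smoothness between neighbours
        cap[i][i + 1] += lam
        cap[i + 1][i] += lam
    _max_flow(cap, s, t)
    # nodes still reachable from s in the residual graph form the label-0 side
    reach, q = {s}, deque([s])
    while q:
        u = q.popleft()
        for v, c in cap[u].items():
            if c > 1e-12 and v not in reach:
                reach.add(v)
                q.append(v)
    return [0 if i in reach else 1 for i in range(len(values))]
```

    On the input `[0.1, 0.2, 0.9, 0.8]` with smoothness weight 0.3, the min cut recovers the labelling `[0, 0, 1, 1]`, paying one boundary penalty rather than four data-term misfits.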

    A Survey on Video-based Graphics and Video Visualization


    3D confocal laser-scanning microscopy for large-area imaging of the corneal subbasal nerve plexus

    The capability of corneal confocal microscopy (CCM) to acquire high-resolution in vivo images of the densely innervated human cornea has generated considerable interest in using this non-invasive technique as an objective diagnostic tool for staging peripheral neuropathies. Morphological alterations of the corneal subbasal nerve plexus (SNP) assessed by CCM have been shown to correlate well with the progression of neuropathic diseases and even to predict future incident neuropathy. Since the field of view of single CCM images is insufficient for reliable characterisation of nerve morphology, several image mosaicking techniques have been developed to facilitate the assessment of the SNP in large-area visualisations. Due to the limited depth of field of confocal microscopy, these approaches are highly sensitive to small deviations of the focus plane from the SNP layer. Our contribution proposes a new automated solution, combining guided eye movements for rapid expansion of the acquired SNP area with axial focus-plane oscillations to guarantee complete imaging of the SNP. We present results of a feasibility study using the proposed setup to evaluate different oscillation settings. By comparing different image selection approaches, we show that automatic tissue classification algorithms are essential to create high-quality mosaic images from the acquired 3D dataset.
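    One way to picture the image-selection step is a per-tile focus measure over the axial stack: for each tile of the mosaic, keep the slice whose local contrast is highest. The sketch below is an illustrative stand-in, not the authors' tissue classifier; it scores each slice by the variance of a discrete Laplacian, a common focus measure.

```python
import numpy as np

def sharpest_slice_mosaic(stack, tile=8):
    """For each image tile, keep the pixels of the axial slice whose local
    contrast (variance of a discrete Laplacian) is highest.
    stack: array of shape (slices, height, width)."""
    z, h, w = stack.shape
    out = np.empty((h, w), dtype=stack.dtype)
    for y in range(0, h, tile):
        for x in range(0, w, tile):
            patch = stack[:, y:y + tile, x:x + tile].astype(float)
            # discrete Laplacian along the two image axes (wrap-around edges)
            lap = (np.roll(patch, 1, axis=1) + np.roll(patch, -1, axis=1)
                   + np.roll(patch, 1, axis=2) + np.roll(patch, -1, axis=2)
                   - 4.0 * patch)
            best = lap.var(axis=(1, 2)).argmax()   # most in-focus slice
            out[y:y + tile, x:x + tile] = stack[best, y:y + tile, x:x + tile]
    return out
```

    A real pipeline would replace the focus score with a trained tissue classifier, as the abstract argues, but the per-tile selection structure is the same.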

    FetReg2021: A Challenge on Placental Vessel Segmentation and Registration in Fetoscopy

    Fetoscopy laser photocoagulation is a widely adopted procedure for treating Twin-to-Twin Transfusion Syndrome (TTTS). The procedure involves photocoagulation of pathological anastomoses to regulate blood exchange among the twins. It is particularly challenging due to the limited field of view, poor manoeuvrability of the fetoscope, poor visibility, and variability in illumination. These challenges may lead to increased surgery time and incomplete ablation. Computer-assisted intervention (CAI) can provide surgeons with decision support and context awareness by identifying key structures in the scene and expanding the fetoscopic field of view through video mosaicking. Research in this domain has been hampered by the lack of high-quality data with which to design, develop and test CAI algorithms. Through the Fetoscopic Placental Vessel Segmentation and Registration (FetReg2021) challenge, organized as part of the MICCAI 2021 Endoscopic Vision challenge, we released the first large-scale multicentre TTTS dataset for the development of generalized and robust semantic segmentation and video mosaicking algorithms. For this challenge, we released a dataset of 2060 images, pixel-annotated for vessel, tool, fetus and background classes, from 18 in-vivo TTTS fetoscopy procedures and 18 short video clips. Seven teams participated, and their model performance was assessed on an unseen test dataset of 658 pixel-annotated images from 6 fetoscopic procedures and 6 short clips. The challenge provided an opportunity to create generalized solutions for fetoscopic scene understanding and mosaicking. In this paper, we present the findings of the FetReg2021 challenge together with a detailed literature review of CAI in TTTS fetoscopy. Through this challenge, its analysis and the release of multi-centre fetoscopic data, we provide a benchmark for future research in this field.
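    Pixel-annotated segmentation entries in challenges like this are commonly scored with per-class intersection-over-union, averaged over the classes present. The minimal sketch below illustrates that score; the challenge's exact metric and implementation may differ.

```python
import numpy as np

def mean_iou(pred, gt, num_classes):
    """Mean of per-class intersection-over-union over label maps `pred`
    and `gt`, skipping classes absent from both."""
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, gt == c).sum()
        union = np.logical_or(pred == c, gt == c).sum()
        if union > 0:
            ious.append(inter / union)
    return sum(ious) / len(ious)
```

    For example, with four pixels, prediction `[0, 0, 1, 1]` against ground truth `[0, 1, 1, 1]` gives IoU 1/2 for class 0 and 2/3 for class 1, so a mean IoU of 7/12.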

    Image-based rendering and synthesis

    Multiview imaging (MVI) is an active research area with a wide range of applications, and it opens up research in other topics, including virtual view synthesis for three-dimensional (3D) television (3DTV) and entertainment. However, multiview systems require a large amount of storage and are difficult to construct. Image-based rendering (IBR) is the concept behind allowing 3D scenes and objects to be visualized realistically without full 3D model reconstruction. Using images as the primary substrate, IBR has many potential applications, including video games and virtual travel. The technique creates new views of scenes, reconstructed from a collection of densely sampled images or videos. IBR approaches can be classified by how much geometry they use: at one extreme, the 3D models and lighting conditions are known and new views are rendered using conventional graphics techniques; at the other, light-field or lumigraph rendering relies on dense sampling, with little or no geometry, to render new views without recovering exact 3D models.
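    The light-field end of that spectrum can be caricatured in a few lines: with a densely sampled 1D camera rig and no geometry at all, a virtual view is approximated by blending the two nearest captured images. This is an illustrative sketch, not a full lumigraph renderer, and it ignores the disparity correction a real system would apply.

```python
import numpy as np

def synthesize_view(images, positions, x):
    """Blend the two captured views bracketing virtual camera position x
    on a 1D rig; with dense enough sampling this approximates the true view.
    images: list of equally-shaped arrays; positions: sorted 1D array."""
    idx = int(np.clip(np.searchsorted(positions, x), 1, len(positions) - 1))
    p0, p1 = positions[idx - 1], positions[idx]
    alpha = (x - p0) / (p1 - p0)          # fractional position between cameras
    return (1.0 - alpha) * images[idx - 1] + alpha * images[idx]
```

    Dense sampling is what makes this work: the closer the cameras, the smaller the ghosting the naive blend introduces, which is exactly the geometry-versus-sampling trade-off the classification above describes.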

    Identifying and quantifying the abundance of economically important palms in tropical moist forest using UAV imagery

    Sustainable management of non-timber forest products such as palm fruits is crucial for the long-term conservation of intact forest. A major limitation to expanding sustainable management of palms has been the need for precise information about the resources at scales of tens to hundreds of hectares, while typical ground-based surveys only sample small areas. In recent years, small unmanned aerial vehicles (UAVs) have become an important tool for mapping forest areas: they are cheap and easy to transport, and they provide high spatial resolution imagery of remote areas. We developed an object-based classification workflow for RGB UAV imagery that identifies and delineates palm tree crowns in tropical rainforest by combining image processing and GIS functionalities, using color and textural information in an integrative way, to demonstrate one of the potential uses of UAVs in tropical forests. Ten permanent forest plots with 1170 reference palm trees were assessed from October to December 2017. The results indicate that palm tree crowns could be clearly identified and, in some cases, quantified following the workflow. The best results were obtained using the random forest classifier, with an 85% overall accuracy and a 0.82 kappa index.
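    The two reported scores, overall accuracy and the kappa index, both fall out of the classification's confusion matrix; kappa additionally discounts the agreement expected by chance from the class marginals. The sketch below is illustrative, not the study's evaluation code.

```python
import numpy as np

def accuracy_and_kappa(confusion):
    """Overall accuracy and Cohen's kappa from a square confusion matrix
    (rows = reference labels, columns = classified labels)."""
    c = np.asarray(confusion, dtype=float)
    n = c.sum()
    observed = np.trace(c) / n                                   # overall accuracy
    expected = (c.sum(axis=0) * c.sum(axis=1)).sum() / n ** 2    # chance agreement
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa
```

    For a balanced two-class matrix `[[40, 10], [10, 40]]`, observed agreement is 0.8 and chance agreement is 0.5, giving kappa 0.6; a kappa of 0.82 therefore indicates agreement well beyond the class-frequency baseline.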

    The Video Mesh: A Data Structure for Image-based Three-dimensional Video Editing

    This paper introduces the video mesh, a data structure for representing video as 2.5D “paper cutouts.” The video mesh allows interactive editing of moving objects and modeling of depth, which enables 3D effects and post-exposure camera control. The video mesh sparsely encodes optical flow as well as depth, and handles occlusion using local layering and alpha mattes. Motion is described by a sparse set of points tracked over time. Each point also stores a depth value. The video mesh is a triangulation over this point set, and per-pixel information is obtained by interpolation. The user rotoscopes occluding contours, and we introduce an algorithm to cut the video mesh along them. Object boundaries are refined with per-pixel alpha values. Because the video mesh is at its core a set of texture-mapped triangles, we can leverage graphics hardware to enable interactive editing and rendering of a variety of effects. We demonstrate the effectiveness of our representation with special effects such as 3D viewpoint changes, object insertion, depth-of-field manipulation, and 2D-to-3D video conversion.
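    The per-pixel recovery step described above, obtaining dense values from sparse per-vertex data over a triangulation, is what barycentric interpolation does inside each triangle (and what the graphics hardware performs natively). The sketch below is hypothetical code for one triangle, not the paper's implementation.

```python
import numpy as np

def interpolate_depth(tri, depths, p):
    """Depth at pixel p inside a triangle with 2D vertices `tri` and sparse
    per-vertex `depths`, via barycentric interpolation."""
    a, b, c = (np.asarray(v, dtype=float) for v in tri)
    # solve for the barycentric weights w1, w2 (with w0 = 1 - w1 - w2)
    w1, w2 = np.linalg.solve(np.column_stack([b - a, c - a]),
                             np.asarray(p, dtype=float) - a)
    w0 = 1.0 - w1 - w2
    return w0 * depths[0] + w1 * depths[1] + w2 * depths[2]
```

    Any per-vertex attribute (flow vectors, alpha) interpolates the same way, which is why a single triangulated point set can carry all of the video mesh's sparse data.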