195 research outputs found

    Fast Monte-Carlo Localization on Aerial Vehicles using Approximate Continuous Belief Representations

    Size, weight, and power constrained platforms impose constraints on computational resources that introduce unique challenges in implementing localization algorithms. We present a framework to perform fast localization on such platforms enabled by the compressive capabilities of Gaussian Mixture Model representations of point cloud data. Given raw structural data from a depth sensor and pitch and roll estimates from an on-board attitude reference system, a multi-hypothesis particle filter localizes the vehicle by exploiting the likelihood of the data originating from the mixture model. We demonstrate analysis of this likelihood in the vicinity of the ground truth pose and detail its utilization in a particle filter-based vehicle localization strategy, and later present results of real-time implementations on a desktop system and an off-the-shelf embedded platform that outperform localization results from running a state-of-the-art algorithm on the same environment
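As a rough illustration of the idea in the abstract above, the following sketch weights pose hypotheses by the likelihood of a depth scan under a Gaussian Mixture Model of the environment. It is not the authors' implementation: the function names, the full-covariance parameterization, and the representation of a particle as a rotation matrix plus translation are all assumptions made for this example.

```python
import numpy as np

def gmm_log_likelihood(points, mix_weights, means, covs):
    """Sum over points of log sum_k w_k N(x | mu_k, Sigma_k). points: (N, d)."""
    n, d = points.shape
    log_terms = []
    for w, mu, cov in zip(mix_weights, means, covs):
        diff = points - mu
        inv = np.linalg.inv(cov)
        # Squared Mahalanobis distance of every point to this component.
        maha = np.einsum("ni,ij,nj->n", diff, inv, diff)
        _, logdet = np.linalg.slogdet(cov)
        log_terms.append(np.log(w) - 0.5 * (maha + logdet + d * np.log(2 * np.pi)))
    log_terms = np.stack(log_terms)  # shape (K, N)
    # Log-sum-exp over components for numerical stability.
    m = log_terms.max(axis=0)
    per_point = m + np.log(np.exp(log_terms - m).sum(axis=0))
    return per_point.sum()

def particle_weights(scan, particles, mix_weights, means, covs):
    """Normalized weights for hypothetical particles given as (R, t) pairs.

    Each particle maps the sensor-frame scan into the map frame, where the
    mixture model is assumed to be defined.
    """
    logl = np.array([
        gmm_log_likelihood(scan @ R.T + t, mix_weights, means, covs)
        for R, t in particles
    ])
    logl -= logl.max()  # avoid underflow before exponentiating
    w = np.exp(logl)
    return w / w.sum()
```

In a full particle filter these weights would feed a resampling step; that step, and the use of on-board pitch and roll estimates to reduce the search space, are omitted here.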

    SIFT Flow: Dense Correspondence across Scenes and its Applications

    While image alignment has been studied in different areas of computer vision for decades, aligning images depicting different scenes remains a challenging problem. Analogous to optical flow where an image is aligned to its temporally adjacent frame, we propose SIFT flow, a method to align an image to its nearest neighbors in a large image corpus containing a variety of scenes. The SIFT flow algorithm consists of matching densely sampled, pixel-wise SIFT features between two images, while preserving spatial discontinuities. The SIFT features allow robust matching across different scene/object appearances, whereas the discontinuity-preserving spatial model allows matching of objects located at different parts of the scene. Experiments show that the proposed approach robustly aligns complex scene pairs containing significant spatial differences. Based on SIFT flow, we propose an alignment-based large database framework for image analysis and synthesis, where image information is transferred from the nearest neighbors to a query image according to the dense scene correspondence. This framework is demonstrated through concrete applications, such as motion field prediction from a single image, motion synthesis via object transfer, satellite image registration and face recognition

    Dense soft tissue 3D reconstruction refined with super-pixel segmentation for robotic abdominal surgery

    Purpose: Single-incision laparoscopic surgery decreases postoperative infections, but introduces limitations in the surgeon’s maneuverability and in the surgical field of view. This work aims at enhancing intra-operative surgical visualization by exploiting the 3D information about the surgical site. An interactive guidance system is proposed wherein the pose of preoperative tissue models is updated online. A critical process involves the intra-operative acquisition of tissue surfaces. It can be achieved using stereoscopic imaging and 3D reconstruction techniques. This work contributes to this process by proposing new methods for improved dense 3D reconstruction of soft tissues, which allows a more accurate deformation identification and facilitates the registration process. Methods: Two methods for soft tissue 3D reconstruction are proposed: Method 1 follows the traditional approach of the block matching algorithm. Method 2 performs a nonparametric modified census transform to be more robust to illumination variation. The simple linear iterative clustering (SLIC) super-pixel algorithm is exploited for disparity refinement by filling holes in the disparity images. Results: The methods were validated using two video datasets from the Hamlyn Centre, achieving an accuracy of 2.95 and 1.66 mm, respectively. A comparison with ground-truth data demonstrated that the disparity refinement procedure (1) increases the number of reconstructed points by up to 43% and (2) does not affect the accuracy of the 3D reconstructions significantly. Conclusion: Both methods give results that compare favorably with the state-of-the-art methods. The computational time constrains their applicability in real time, but it can be greatly improved by using a GPU implementation.
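To make the illumination-robustness claim above concrete, here is a minimal sketch of the plain census transform and a Hamming-distance matching cost. It is an assumption-laden illustration, not the paper's method: the abstract describes a modified census transform, whereas this version uses the classic centre-pixel comparison, and the function names and the `radius` parameter are hypothetical.

```python
import numpy as np

def census_transform(img, radius=1):
    """Encode each pixel as a bit string: 1 where a neighbour is darker than the centre.

    With radius <= 3 the window has at most 48 neighbours, so the code fits
    in a 64-bit integer.
    """
    h, w = img.shape
    padded = np.pad(img, radius, mode="edge")
    code = np.zeros((h, w), dtype=np.uint64)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            if dy == 0 and dx == 0:
                continue  # skip the centre pixel itself
            neighbour = padded[radius + dy : radius + dy + h,
                               radius + dx : radius + dx + w]
            bit = (neighbour < img).astype(np.uint64)
            code = (code << np.uint64(1)) | bit
    return code

def hamming_distance(a, b):
    """Per-pixel number of differing bits between two census code arrays."""
    x = np.bitwise_xor(a, b)
    dist = np.zeros(x.shape, dtype=np.int64)
    while np.any(x):
        dist += (x & np.uint64(1)).astype(np.int64)
        x = x >> np.uint64(1)
    return dist
```

Because the code depends only on the ordering of intensities within the window, any monotonically increasing brightness change (for example a gain and offset) leaves it unchanged, which is the property that makes census-based costs attractive for stereo matching under varying illumination.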

    Free Viewpoint Video Based on Stitching Technique

    Image stitching is a technique for creating one panoramic scene from multiple images. It is used in panoramic photography and video, where the viewer can only scroll horizontally and vertically across the scene. However, stitching has not been used for creating free-viewpoint videos (FVV), where viewers can change their viewpoint freely and smoothly while playing the video. The current research implements an FVV playing system using image stitching; this system allows users to move their viewpoint freely and smoothly. To use the system, the user captures MVV from different viewpoints, with an appropriate overlapping region for each pair of cameras. The system then stitches the overlapping videos to create one or more stitched videos and displays them in the FVV playing system, applying free and smooth switching and interpolation of viewpoints during video playback. The research evaluated the performance of the video playing system in terms of system idea, system accuracy, smoothness, and user satisfaction. The results of the evaluation were very positive in most aspects.

    Viewpoint-Free Photography for Virtual Reality

    Viewpoint-free photography, i.e., interactively controlling the viewpoint of a photograph after capture, is a standing challenge. In this thesis, we investigate algorithms to enable viewpoint-free photography for virtual reality (VR) from casual capture, i.e., from footage easily captured with consumer cameras. We build on an extensive body of work in image-based rendering (IBR). Given images of an object or scene, IBR methods aim to predict the appearance of an image taken from a novel perspective. Most IBR methods focus on full or near-interpolation, where the output viewpoints either lie directly between captured images, or nearby. These methods are not suitable for VR, where the user has significant range of motion and can look in all directions. Thus, it is essential to create viewpoint-free photos with a wide field-of-view and sufficient positional freedom to cover the range of motion a user might experience in VR. We focus on two VR experiences: 1) Seated VR experiences, where the user can lean in different directions. This simplifies the problem, as the scene is only observed from a small range of viewpoints. Thus, we focus on easy capture, showing how to turn panorama-style capture into 3D photos, a simple representation for viewpoint-free photos, and also how to speed up processing so users can see the final result on-site. 2) Room-scale VR experiences, where the user can explore vastly different perspectives. This is challenging: More input footage is needed, maintaining real-time display rates becomes difficult, view-dependent appearance and object backsides need to be modelled, all while preventing noticeable mistakes. We address these challenges by: (1) creating refined geometry for each input photograph, (2) using a fast tiled rendering algorithm to achieve real-time display rates, and (3) using a convolutional neural network to hide visual mistakes during compositing. 
Overall, we provide evidence that viewpoint-free photography is feasible from casual capture. We thoroughly compare with the state-of-the-art, showing that our methods achieve both a numerical improvement and a clear increase in visual quality for both seated and room-scale VR experiences

    Appearance Modelling and Reconstruction for Navigation in Minimally Invasive Surgery

    Minimally invasive surgery is playing an increasingly important role for patient care. Whilst its direct patient benefit in terms of reduced trauma, improved recovery and shortened hospitalisation has been well established, there is a sustained need for improved training of the existing procedures and the development of new smart instruments to tackle the issue of visualisation, ergonomic control, haptic and tactile feedback. For endoscopic intervention, the small field of view in the presence of a complex anatomy can easily introduce disorientation to the operator as the tortuous access pathway is not always easy to predict and control with standard endoscopes. Effective training through simulation devices, based on either virtual reality or mixed-reality simulators, can help to improve the spatial awareness, consistency and safety of these procedures. This thesis examines the use of endoscopic videos for both simulation and navigation purposes. More specifically, it addresses the challenging problem of how to build high-fidelity subject-specific simulation environments for improved training and skills assessment. Issues related to mesh parameterisation and texture blending are investigated. With the maturity of computer vision in terms of both 3D shape reconstruction and localisation and mapping, vision-based techniques have enjoyed significant interest in recent years for surgical navigation. The thesis also tackles the problem of how to use vision-based techniques for providing a detailed 3D map and dynamically expanded field of view to improve spatial awareness and avoid operator disorientation. The key advantage of this approach is that it does not require additional hardware, and thus introduces minimal interference to the existing surgical workflow. The derived 3D map can be effectively integrated with pre-operative data, allowing both global and local 3D navigation by taking into account tissue structural and appearance changes. 
Both simulation and laboratory-based experiments are conducted throughout this research to assess the practical value of the method proposed