
    Optical techniques for 3D surface reconstruction in computer-assisted laparoscopic surgery

    One of the main challenges for computer-assisted surgery (CAS) is to determine the intra-operative morphology and motion of soft tissues. This information is a prerequisite for the registration of multi-modal patient-specific data, both for enhancing the surgeon's navigation capabilities by observing beyond exposed tissue surfaces and for providing intelligent control of robotic-assisted instruments. In minimally invasive surgery (MIS), optical techniques are an increasingly attractive approach for in vivo 3D reconstruction of the soft-tissue surface geometry. This paper reviews the state-of-the-art methods for optical intra-operative 3D reconstruction in laparoscopic surgery and discusses the technical challenges and future perspectives towards clinical translation. With the recent paradigm shift of surgical practice towards MIS and new developments in 3D optical imaging, this is a timely discussion of technologies that could facilitate complex CAS procedures in dynamic and deformable anatomical regions.
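    As a rough illustration of one passive technique family covered by such reviews, the sketch below recovers depth from a rectified stereo pair using OpenCV semi-global block matching. The file names, focal length and baseline are placeholder assumptions for illustration, not values taken from the paper.

    import cv2
    import numpy as np

    # Passive stereo sketch: disparity from a rectified image pair,
    # then depth via Z = f * B / d. All numeric values are illustrative only.
    left = cv2.imread("left_rectified.png", cv2.IMREAD_GRAYSCALE)
    right = cv2.imread("right_rectified.png", cv2.IMREAD_GRAYSCALE)

    stereo = cv2.StereoSGBM_create(minDisparity=0, numDisparities=64, blockSize=7)
    disparity = stereo.compute(left, right).astype(np.float32) / 16.0  # SGBM output is fixed-point (x16)

    f_px, baseline_m = 500.0, 0.004  # assumed focal length (px) and stereo baseline (m)
    depth = np.where(disparity > 0, f_px * baseline_m / np.maximum(disparity, 1e-6), 0.0)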

    LiveCap: Real-time Human Performance Capture from Monocular Video

    We present the first real-time human performance capture approach that reconstructs dense, space-time coherent deforming geometry of entire humans in general everyday clothing from just a single RGB video. We propose a novel two-stage analysis-by-synthesis optimization whose formulation and implementation are designed for high performance. In the first stage, a skinned template model is jointly fitted to background-subtracted input video, 2D and 3D skeleton joint positions found using a deep neural network, and a set of sparse facial landmark detections. In the second stage, dense non-rigid 3D deformations of skin and even loose apparel are captured based on a novel real-time capable algorithm for non-rigid tracking using dense photometric and silhouette constraints. Our novel energy formulation leverages automatically identified material regions on the template to model the differing non-rigid deformation behavior of skin and apparel. The two resulting non-linear optimization problems per frame are solved with specially tailored data-parallel Gauss-Newton solvers. In order to achieve real-time performance of over 25 Hz, we design a pipelined parallel architecture using the CPU and two commodity GPUs. Our method is the first real-time monocular approach for full-body performance capture, and it yields accuracy comparable to off-line performance capture techniques while being orders of magnitude faster.
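    The per-frame energies are minimized with data-parallel Gauss-Newton solvers. Below is a minimal, generic damped Gauss-Newton loop in NumPy that shows only the solver structure, assuming user-supplied residual and Jacobian functions; the paper's GPU solvers and its actual photometric and silhouette energy terms are not reproduced here.

    import numpy as np

    def gauss_newton(residual_fn, jacobian_fn, x0, iters=10, damping=1e-4):
        # Generic damped Gauss-Newton loop: at each iteration, linearize the
        # stacked residuals (e.g. photometric + silhouette terms) and solve
        # the damped normal equations for an update of the parameters.
        x = np.asarray(x0, dtype=float).copy()
        for _ in range(iters):
            r = residual_fn(x)                       # residual vector
            J = jacobian_fn(x)                       # Jacobian of residuals w.r.t. x
            H = J.T @ J + damping * np.eye(x.size)   # damped Gauss-Newton Hessian approximation
            x -= np.linalg.solve(H, J.T @ r)         # Gauss-Newton update step
        return x

    # Toy usage: minimize ||x**2 - [1, 4]||^2, converging to x = [1, 2].
    # x = gauss_newton(lambda x: x**2 - np.array([1.0, 4.0]),
    #                  lambda x: np.diag(2.0 * x), x0=np.array([0.5, 0.5]))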

    Learning to Reconstruct People in Clothing from a Single RGB Camera

    We present a learning-based model to infer the personalized 3D shape of people from a few frames (1-8) of a monocular video in which the person is moving, in less than 10 seconds and with a reconstruction accuracy of 5 mm. Our model learns to predict the parameters of a statistical body model and instance displacements that add clothing and hair to the shape. The model achieves fast and accurate predictions based on two key design choices. First, by predicting shape in a canonical T-pose space, the network learns to encode the images of the person into pose-invariant latent codes, where the information is fused. Second, based on the observation that feed-forward predictions are fast but do not always align with the input images, we predict using both bottom-up and top-down streams (one per view), allowing information to flow in both directions. Learning relies only on synthetic 3D data. Once learned, the model can take a variable number of frames as input and is able to reconstruct shapes even from a single image with an accuracy of 6 mm. Results on three different datasets demonstrate the efficacy and accuracy of our approach.
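    A minimal PyTorch sketch of the fuse-then-decode idea described above: per-frame image features are pooled into a pose-invariant code and decoded into body-model parameters plus canonical-pose vertex displacements. The layer sizes, vertex count and max-pooling fusion are assumptions for illustration; the paper's actual bottom-up/top-down architecture is not reproduced.

    import torch
    import torch.nn as nn

    class ShapeFromFrames(nn.Module):
        def __init__(self, n_betas=10, n_verts=6890, feat_dim=256):
            super().__init__()
            self.encoder = nn.Sequential(          # tiny CNN stand-in for the image encoder
                nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                nn.Linear(64, feat_dim), nn.ReLU(),
            )
            self.beta_head = nn.Linear(feat_dim, n_betas)      # statistical body-model parameters
            self.disp_head = nn.Linear(feat_dim, n_verts * 3)  # clothing/hair displacements in T-pose

        def forward(self, frames):                 # frames: (F, 3, H, W), with F in [1, 8]
            codes = self.encoder(frames)           # (F, feat_dim) per-frame latent codes
            fused, _ = codes.max(dim=0)            # permutation-invariant fusion over frames
            betas = self.beta_head(fused)
            displacements = self.disp_head(fused).view(-1, 3)  # per-vertex offsets
            return betas, displacements

    # Usage sketch: betas, disp = ShapeFromFrames()(torch.rand(4, 3, 224, 224))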

    V-SLAM-AIDED PHOTOGRAMMETRY TO PROCESS FISHEYE MULTI-CAMERA SYSTEMS SEQUENCES

    The advent of mobile mapping systems (MMSs) and computer vision algorithms has enriched a wide range of navigation and mapping tasks such as localisation, 3D motion estimation and 3D mapping. This study focuses on Visual Simultaneous Localisation and Mapping (V-SLAM) in the context of two in-house MMSs: Ant3D, a patented five-fisheye multi-camera rig, and GeoRizon, a high-resolution stereo fisheye rig. The aim is to leverage V-SLAM to enhance the systems' performance in near-real-time and non-real-time 3D reconstruction applications. The research investigates both monocular and stereo V-SLAM applied to both MMSs and tackles the challenge of combining the V-SLAM-estimated trajectory of one camera, or a camera pair, with the known multi-camera relative orientation. We propose a state-of-the-art codebase that serves as a flexible and extensible platform for MMS image acquisition and processing, along with an adapted version of the well-established ORB-SLAM3.0. Evaluation is performed in a challenging cultural heritage setting: the Minguzzi spiral staircase in the Duomo di Milano Cathedral. The tests show that introducing V-SLAM trajectories, together with pre-calibrated interior orientation and multi-camera constraints, improves the speed, applicability and accuracy of 3D surveys.
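    At its simplest, combining the V-SLAM trajectory of one rig camera with a pre-calibrated multi-camera relative orientation amounts to composing rigid transforms. The NumPy sketch below illustrates this with assumed 4x4 world-from-camera matrices; it is not the authors' code and does not reflect their data formats.

    import numpy as np

    def propagate_rig_poses(T_world_ref, T_ref_cams):
        # For each epoch of the reference camera's V-SLAM trajectory (a 4x4
        # world-from-reference pose), compose it with the fixed reference-to-camera
        # transforms of the rig to obtain world poses for every camera head.
        return [[T_wr @ T_rc for T_rc in T_ref_cams] for T_wr in T_world_ref]

    # Toy usage: a 2-epoch trajectory and a 2-camera rig with identity calibration.
    # trajectory = [np.eye(4), np.eye(4)]
    # rig_calib  = [np.eye(4), np.eye(4)]
    # all_poses  = propagate_rig_poses(trajectory, rig_calib)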