2,673 research outputs found

    Realistic Virtual Cuts

    Get PDF

    NiftyNet: a deep-learning platform for medical imaging

    Get PDF
    Medical image analysis and computer-assisted intervention problems are increasingly being addressed with deep-learning-based solutions. Established deep-learning platforms are flexible but do not provide specific functionality for medical image analysis and adapting them for this application requires substantial implementation effort. Thus, there has been substantial duplication of effort and incompatible infrastructure developed across many research groups. This work presents the open-source NiftyNet platform for deep learning in medical imaging. The ambition of NiftyNet is to accelerate and simplify the development of these solutions, and to provide a common mechanism for disseminating research outputs for the community to use, adapt and build upon. NiftyNet provides a modular deep-learning pipeline for a range of medical imaging applications including segmentation, regression, image generation and representation learning applications. Components of the NiftyNet pipeline including data loading, data augmentation, network architectures, loss functions and evaluation metrics are tailored to, and take advantage of, the idiosyncracies of medical image analysis and computer-assisted intervention. NiftyNet is built on TensorFlow and supports TensorBoard visualization of 2D and 3D images and computational graphs by default. We present 3 illustrative medical image analysis applications built using NiftyNet: (1) segmentation of multiple abdominal organs from computed tomography; (2) image regression to predict computed tomography attenuation maps from brain magnetic resonance images; and (3) generation of simulated ultrasound images for specified anatomical poses. NiftyNet enables researchers to rapidly develop and distribute deep learning solutions for segmentation, regression, image generation and representation learning applications, or extend the platform to new applications.Comment: Wenqi Li and Eli Gibson contributed equally to this work. M. Jorge Cardoso and Tom Vercauteren contributed equally to this work. 26 pages, 6 figures; Update includes additional applications, updated author list and formatting for journal submissio

    Mean value coordinates–based caricature and expression synthesis

    Get PDF
    We present a novel method for caricature synthesis based on mean value coordinates (MVC). Our method can be applied to any single frontal face image to learn a specified caricature face pair for frontal and 3D caricature synthesis. This technique only requires one or a small number of exemplar pairs and a natural frontal face image training set, while the system can transfer the style of the exemplar pair across individuals. Further exaggeration can be fulfilled in a controllable way. Our method is further applied to facial expression transfer, interpolation, and exaggeration, which are applications of expression editing. Additionally, we have extended our approach to 3D caricature synthesis based on the 3D version of MVC. With experiments we demonstrate that the transferred expressions are credible and the resulting caricatures can be characterized and recognized

    Detail-preserving and Content-aware Variational Multi-view Stereo Reconstruction

    Full text link
    Accurate recovery of 3D geometrical surfaces from calibrated 2D multi-view images is a fundamental yet active research area in computer vision. Despite the steady progress in multi-view stereo reconstruction, most existing methods are still limited in recovering fine-scale details and sharp features while suppressing noises, and may fail in reconstructing regions with few textures. To address these limitations, this paper presents a Detail-preserving and Content-aware Variational (DCV) multi-view stereo method, which reconstructs the 3D surface by alternating between reprojection error minimization and mesh denoising. In reprojection error minimization, we propose a novel inter-image similarity measure, which is effective to preserve fine-scale details of the reconstructed surface and builds a connection between guided image filtering and image registration. In mesh denoising, we propose a content-aware p\ell_{p}-minimization algorithm by adaptively estimating the pp value and regularization parameters based on the current input. It is much more promising in suppressing noise while preserving sharp features than conventional isotropic mesh smoothing. Experimental results on benchmark datasets demonstrate that our DCV method is capable of recovering more surface details, and obtains cleaner and more accurate reconstructions than state-of-the-art methods. In particular, our method achieves the best results among all published methods on the Middlebury dino ring and dino sparse ring datasets in terms of both completeness and accuracy.Comment: 14 pages,16 figures. Submitted to IEEE Transaction on image processin

    Doctor of Philosophy

    Get PDF
    dissertationVisualization and exploration of volumetric datasets has been an active area of research for over two decades. During this period, volumetric datasets used by domain users have evolved from univariate to multivariate. The volume datasets are typically explored and classified via transfer function design and visualized using direct volume rendering. To improve classification results and to enable the exploration of multivariate volume datasets, multivariate transfer functions emerge. In this dissertation, we describe our research on multivariate transfer function design. To improve the classification of univariate volumes, various one-dimensional (1D) or two-dimensional (2D) transfer function spaces have been proposed; however, these methods work on only some datasets. We propose a novel transfer function method that provides better classifications by combining different transfer function spaces. Methods have been proposed for exploring multivariate simulations; however, these approaches are not suitable for complex real-world datasets and may be unintuitive for domain users. To this end, we propose a method based on user-selected samples in the spatial domain to make complex multivariate volume data visualization more accessible for domain users. However, this method still requires users to fine-tune transfer functions in parameter space transfer function widgets, which may not be familiar to them. We therefore propose GuideME, a novel slice-guided semiautomatic multivariate volume exploration approach. GuideME provides the user, an easy-to-use, slice-based user interface that suggests the feature boundaries and allows the user to select features via click and drag, and then an optimal transfer function is automatically generated by optimizing a response function. Throughout the exploration process, the user does not need to interact with the parameter views at all. Finally, real-world multivariate volume datasets are also usually of large size, which is larger than the GPU memory and even the main memory of standard work stations. We propose a ray-guided out-of-core, interactive volume rendering and efficient query method to support large and complex multivariate volumes on standard work stations

    Scene relighting and editing for improved object insertion

    Get PDF
    Abstract. The goal of this thesis is to develop a scene relighting and object insertion pipeline using Neural Radiance Fields (NeRF) to incorporate one or more objects into an outdoor environment scene. The output is a 3D mesh that embodies decomposed bidirectional reflectance distribution function (BRDF) characteristics, which interact with varying light source positions and strengths. To achieve this objective, the thesis is divided into two sub-tasks. The first sub-task involves extracting visual information about the outdoor environment from a sparse set of corresponding images. A neural representation is constructed, providing a comprehensive understanding of the constituent elements, such as materials, geometry, illumination, and shadows. The second sub-task involves generating a neural representation of the inserted object using either real-world images or synthetic data. To accomplish these objectives, the thesis draws on existing literature in computer vision and computer graphics. Different approaches are assessed to identify their advantages and disadvantages, with detailed descriptions of the chosen techniques provided, highlighting their functioning to produce the ultimate outcome. Overall, this thesis aims to provide a framework for compositing and relighting that is grounded in NeRF and allows for the seamless integration of objects into outdoor environments. The outcome of this work has potential applications in various domains, such as visual effects, gaming, and virtual reality

    Tensor Regression with Applications in Neuroimaging Data Analysis

    Get PDF
    Classical regression methods treat covariates as a vector and estimate a corresponding vector of regression coefficients. Modern applications in medical imaging generate covariates of more complex form such as multidimensional arrays (tensors). Traditional statistical and computational methods are proving insufficient for analysis of these high-throughput data due to their ultrahigh dimensionality as well as complex structure. In this article, we propose a new family of tensor regression models that efficiently exploit the special structure of tensor covariates. Under this framework, ultrahigh dimensionality is reduced to a manageable level, resulting in efficient estimation and prediction. A fast and highly scalable estimation algorithm is proposed for maximum likelihood estimation and its associated asymptotic properties are studied. Effectiveness of the new methods is demonstrated on both synthetic and real MRI imaging data.Comment: 27 pages, 4 figure
    corecore