58 research outputs found

    Light Field Diffusion for Single-View Novel View Synthesis

    Single-view novel view synthesis, the task of generating images from new viewpoints based on a single reference image, is an important but challenging problem in computer vision. Recently, the Denoising Diffusion Probabilistic Model (DDPM) has become popular in this area due to its strong ability to generate high-fidelity images. However, current diffusion-based methods rely directly on camera pose matrices as viewing conditions, introducing 3D constraints globally and implicitly. These methods may suffer from inconsistency among images generated from different perspectives, especially in regions with intricate textures and structures. In this work, we present Light Field Diffusion (LFD), a conditional diffusion-based model for single-view novel view synthesis. Unlike previous methods that employ camera pose matrices, LFD transforms the camera view information into a light field encoding and combines it with the reference image. This design introduces local pixel-wise constraints within the diffusion model, thereby encouraging better multi-view consistency. Experiments on several datasets show that LFD can efficiently generate high-fidelity images and maintain better 3D consistency even in intricate regions. Our method generates images of higher quality than NeRF-based models, and achieves sample quality similar to other diffusion-based models with only one-third of the model size.
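    The pixel-wise viewing condition the abstract describes can be pictured as a per-pixel ray map computed from camera intrinsics and pose. The sketch below is illustrative only; the exact light field parameterization used in LFD may differ.

```python
import numpy as np

def light_field_encoding(K, R, t, H, W):
    """Per-pixel ray (origin, direction) map from intrinsics K and a
    world-to-camera pose [R|t]. Returns an (H, W, 6) array: 3 channels for
    the camera center in world coordinates (broadcast to every pixel) and
    3 for the unit ray direction through that pixel."""
    # Pixel grid in homogeneous image coordinates (u, v, 1), sampled at pixel centers.
    u, v = np.meshgrid(np.arange(W) + 0.5, np.arange(H) + 0.5)
    pix = np.stack([u, v, np.ones_like(u)], axis=-1)            # (H, W, 3)
    # Back-project pixels to camera-frame rays, then rotate into world frame.
    dirs_cam = pix @ np.linalg.inv(K).T                          # (H, W, 3)
    dirs_world = dirs_cam @ R                                    # row-vector form of R^T d
    dirs_world /= np.linalg.norm(dirs_world, axis=-1, keepdims=True)
    # Camera center in world coordinates: c = -R^T t.
    origin = (-R.T @ t).reshape(1, 1, 3)
    origins = np.broadcast_to(origin, dirs_world.shape)
    return np.concatenate([origins, dirs_world], axis=-1)        # (H, W, 6)
```

    Concatenating such a 6-channel map with the reference image gives the diffusion model an explicit, local 3D condition at every pixel, rather than a single global pose matrix.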

    TLS-bridged co-prediction of tree-level multifarious stem structure variables from WorldView-2 panchromatic imagery: a case study of the boreal forest

    In forest ecosystem studies, tree stem structure variables (SSVs) have proved to be an essential class of parameters, and simultaneously deriving as many kinds of SSVs as possible at large scales is now preferred for advancing frontier studies on macroecosystem ecology and the global carbon cycle. For this newly emerging task, satellite imagery such as WorldView-2 panchromatic images (WPIs) is used as a potential solution for co-prediction of tree-level multifarious SSVs, with static terrestrial laser scanning (TLS) assumed as a 'bridge'. The specific operation is to pursue the allometric relationships between TLS-derived SSVs and WPI-derived feature parameters, and regression analyses with one or multiple explanatory variables are applied to deduce the prediction models (termed Model1s and Model2s). In the case of Picea abies, Pinus sylvestris, Populus tremula and Quercus robur in a boreal forest, tests showed that Model1s and Model2s can be derived for different tree species (e.g. a maximum R2 = 0.574 for Q. robur). Overall, this study basically validated the proposed algorithm for co-prediction of multifarious SSVs, and its contribution is equivalent to developing a viable solution for SSV-estimation upscaling, which is useful for large-scale investigations of forest understory, macroecosystem ecology, global vegetation dynamics and the global carbon cycle. This work was financially supported in part by the National Natural Science Foundation of China [grant numbers 41471281 and 31670718] and in part by the SRF for ROCS, SEM, China.
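    Allometric relationships of the kind pursued here are typically fitted as power laws via log-log regression. The sketch below shows the general recipe with hypothetical variable names (a WPI-derived crown feature predicting a TLS-derived SSV); the study's actual predictors and model forms differ by species.

```python
import numpy as np

def fit_allometric(x, y):
    """Fit an allometric model y = a * x**b by ordinary least squares in
    log-log space and return (a, b, r2). x and y must be positive."""
    lx, ly = np.log(x), np.log(y)
    b, log_a = np.polyfit(lx, ly, 1)            # slope = exponent, intercept = log(a)
    pred = log_a + b * lx
    ss_res = np.sum((ly - pred) ** 2)
    ss_tot = np.sum((ly - ly.mean()) ** 2)
    return np.exp(log_a), b, 1.0 - ss_res / ss_tot
```

    The returned R2 (computed in log space) plays the role of the goodness-of-fit values reported per species, such as the maximum R2 = 0.574 for Q. robur.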

    Diffeomorphic Deformation via Sliced Wasserstein Distance Optimization for Cortical Surface Reconstruction

    Mesh deformation is a core task for 3D mesh reconstruction, but defining an efficient discrepancy between predicted and target meshes remains an open problem. A prevalent approach in current deep learning is the set-based approach, which measures the discrepancy between two surfaces by comparing two randomly sampled point clouds from the two meshes with the Chamfer pseudo-distance. Nevertheless, the set-based approach still has limitations, such as the lack of a theoretical guarantee for choosing the number of points in the sampled point clouds, and the pseudo-metricity and quadratic complexity of the Chamfer divergence. To address these issues, we propose a novel metric for learning mesh deformation. The metric is defined by the sliced Wasserstein distance on meshes represented as probability measures, which generalizes the set-based approach. By leveraging the space of probability measures, we gain flexibility in encoding meshes using diverse forms of probability measures, such as continuous, empirical, and discrete measures via the varifold representation. Having encoded meshes as probability measures, we can compare them using the sliced Wasserstein distance, an effective optimal transport distance with linear computational complexity that provides a fast statistical rate for approximating the surface of meshes. Furthermore, we employ a neural ordinary differential equation (ODE) to deform the input surface into the target shape by modeling the trajectories of the points on the surface. Our experiments on cortical surface reconstruction demonstrate that our approach surpasses other competing methods across multiple datasets and metrics.
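    The key computational trick behind the sliced Wasserstein distance is that in one dimension optimal transport reduces to sorting. The sketch below computes a Monte-Carlo sliced 2-Wasserstein distance between two point clouds treated as empirical measures; it illustrates the metric the paper builds on, not the authors' mesh/varifold implementation.

```python
import numpy as np

def sliced_wasserstein(X, Y, n_projections=128, seed=0):
    """Monte-Carlo sliced 2-Wasserstein distance between point clouds
    X (n, d) and Y (m, d). Each random unit direction reduces the problem
    to 1D optimal transport, solved by sorting, so the per-projection cost
    is near-linear in the number of points."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    theta = rng.normal(size=(n_projections, d))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)   # directions on the sphere
    total = 0.0
    q = np.linspace(0.0, 1.0, 200)                          # common quantile grid
    for t in theta:
        px, py = np.sort(X @ t), np.sort(Y @ t)
        # Compare the quantile functions of the two 1D projections.
        qx, qy = np.quantile(px, q), np.quantile(py, q)
        total += np.mean((qx - qy) ** 2)
    return np.sqrt(total / n_projections)
```

    Because each projection costs only a sort, the overall complexity is O(L n log n) for L projections, versus the quadratic pairwise comparisons of the Chamfer divergence.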

    CVTHead: One-shot Controllable Head Avatar with Vertex-feature Transformer

    Reconstructing personalized animatable head avatars has significant implications in the fields of AR/VR. Existing methods for achieving explicit face control with 3D Morphable Models (3DMM) typically rely on multi-view images or videos of a single subject, making the reconstruction process complex. Additionally, the traditional rendering pipeline is time-consuming, limiting real-time animation possibilities. In this paper, we introduce CVTHead, a novel approach that generates controllable neural head avatars from a single reference image using point-based neural rendering. CVTHead treats the sparse vertices of the mesh as a point set and employs the proposed Vertex-feature Transformer to learn local feature descriptors for each vertex. This enables the modeling of long-range dependencies among all the vertices. Experimental results on the VoxCeleb dataset demonstrate that CVTHead achieves comparable performance to state-of-the-art graphics-based methods. Moreover, it enables efficient rendering of novel human heads with various expressions, head poses, and camera views. These attributes can be explicitly controlled using the coefficients of 3DMMs, facilitating versatile and realistic animation in real-time scenarios. Comment: WACV 2024

    MedGen3D: A Deep Generative Framework for Paired 3D Image and Mask Generation

    Acquiring and annotating sufficient labeled data is crucial in developing accurate and robust learning-based models, but obtaining such data can be challenging in many medical image segmentation tasks. One promising solution is to synthesize realistic data with ground-truth mask annotations. However, no prior studies have explored generating complete 3D volumetric images with masks. In this paper, we present MedGen3D, a deep generative framework that can generate paired 3D medical images and masks. First, we represent the 3D medical data as 2D sequences and propose the Multi-Condition Diffusion Probabilistic Model (MC-DPM) to generate multi-label mask sequences adhering to anatomical geometry. Then, we use an image sequence generator and a semantic diffusion refiner conditioned on the generated mask sequences to produce realistic 3D medical images that align with the generated masks. Our proposed framework guarantees accurate alignment between synthetic images and segmentation maps. Experiments on 3D thoracic CT and brain MRI datasets show that our synthetic data is both diverse and faithful to the original data, and demonstrate the benefits for downstream segmentation tasks. We anticipate that MedGen3D's ability to synthesize paired 3D medical images and masks will prove valuable in training deep learning models for medical imaging tasks. Comment: Submitted to MICCAI 2023. Project Page: https://krishan999.github.io/MedGen3D

    Identity-Aware Hand Mesh Estimation and Personalization from RGB Images

    Reconstructing 3D hand meshes from monocular RGB images has attracted an increasing amount of attention due to its enormous potential applications in the field of AR/VR. Most state-of-the-art methods attempt to tackle this task in an anonymous manner. Specifically, the identity of the subject is ignored even though it is practically available in real applications, where the user is unchanged over a continuous recording session. In this paper, we propose an identity-aware hand mesh estimation model, which can incorporate the identity information represented by the intrinsic shape parameters of the subject. We demonstrate the importance of the identity information by comparing the proposed identity-aware model to a baseline that treats subjects anonymously. Furthermore, to handle the use case where the test subject is unseen, we propose a novel personalization pipeline to calibrate the intrinsic shape parameters using only a few unlabeled RGB images of the subject. Experiments on two large-scale public datasets validate the state-of-the-art performance of our proposed method. Comment: ECCV 2022. GitHub: https://github.com/deyingk/PersonalizedHandMeshEstimation

    Diffeomorphic Image Registration with Neural Velocity Field

    Diffeomorphic image registration, offering smooth transformation and topology preservation, is required in many medical image analysis tasks. Traditional methods impose certain modeling constraints on the space of admissible transformations and use optimization to find the optimal transformation between two images. Specifying the right space of admissible transformations is challenging: the registration quality can be poor if the space is too restrictive, while the optimization can be hard to solve if the space is too general. Recent learning-based methods, utilizing deep neural networks to learn the transformation directly, achieve fast inference, but their accuracy can suffer because small local deformations are hard to capture and generalization is limited. Here we propose a new optimization-based method named DNVF (Diffeomorphic Image Registration with Neural Velocity Field), which utilizes a deep neural network to model the space of admissible transformations. A multilayer perceptron (MLP) with sinusoidal activation functions represents the continuous velocity field, assigning a velocity vector to every point in space and providing the flexibility to model complex deformations as well as the convenience of optimization. Moreover, we propose a cascaded image registration framework (Cas-DNVF) that combines the benefits of both optimization- and learning-based methods, where a fully convolutional neural network (FCN) is trained to predict the initial deformation, followed by DNVF for further refinement. Experiments on two large-scale 3D MR brain scan datasets demonstrate that our proposed methods significantly outperform state-of-the-art registration methods. Comment: WACV 2023
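    The two ingredients the abstract names can be sketched compactly: a sinusoidal MLP that maps any 3D point to a velocity vector, and an integrator that composes the velocity field into a deformation. This is a NumPy sketch with random weights under SIREN-style initialization; the actual DNVF network is trained by optimizing a registration objective, and its architecture details may differ.

```python
import numpy as np

class SineVelocityField:
    """Tiny MLP with sinusoidal activations mapping a 3D point to a 3D
    velocity vector, in the spirit of a continuous neural velocity field."""
    def __init__(self, hidden=64, omega0=30.0, seed=0):
        rng = np.random.default_rng(seed)
        self.omega0 = omega0
        bound = np.sqrt(6.0 / hidden) / omega0
        self.W1 = rng.uniform(-1.0 / 3.0, 1.0 / 3.0, size=(3, hidden))
        self.W2 = rng.uniform(-bound, bound, size=(hidden, hidden))
        self.W3 = rng.uniform(-bound, bound, size=(hidden, 3))

    def __call__(self, x):
        h = np.sin(self.omega0 * (x @ self.W1))
        h = np.sin(self.omega0 * (h @ self.W2))
        return h @ self.W3          # velocity vector at every query point

def integrate(field, points, steps=8, T=1.0):
    """Euler integration of dx/dt = v(x): a minimal stand-in for composing
    a stationary velocity field into a smooth deformation."""
    x = points.copy()
    dt = T / steps
    for _ in range(steps):
        x = x + dt * field(x)
    return x
```

    Because the field is a continuous function of position, it can be queried at arbitrary coordinates, which is what lets an optimization-based method refine small local deformations that a voxel-grid prediction might miss.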

    Causes of death and conditional survival estimates of long-term lung cancer survivors.

    INTRODUCTION: Lung cancer ranks as the leading cause of cancer-related death worldwide. This retrospective cohort study was designed to determine time-dependent death hazards of diverse causes and conditional survival of lung cancer. METHODS: We collected 816,436 lung cancer cases diagnosed during 2000-2015 from the SEER database; after exclusion, 612,100 cases were enrolled for data analyses. Cancer-specific survival, overall survival and dynamic death hazard were assessed in this study. Additionally, based on the FDA approval of Nivolumab in 2015, we evaluated the effect of immunotherapy on metastatic patients' survival by comparing cases in 2016-2018 (immunotherapy era, n=7,135) with those in 2013-2016 (non-immunotherapy era, n=42,061). RESULTS: Of the 612,100 patients, 285,705 were women, and the mean (SD) age was 68.3 (11.0) years. 252,558 patients had lung adenocarcinoma, 133,302 had lung squamous cell carcinoma, and 78,700 had small cell lung carcinoma. TNM stage was I in 140,518 cases, II in 38,225 cases, III in 159,095 cases, and IV in 274,262 cases; 164,394 cases underwent surgical intervention. The 5-year overall survival and cancer-specific survival were 54.2% and 73.8%, respectively. The 5-year conditional cancer-specific survival rate improved in a time-dependent pattern, while conditional overall survival tended to plateau after 5 years of follow-up. Apart from age, hazard disparities of other risk factors (such as stage and surgery) diminished over time according to the conditional survival curves. Eight years after diagnosis, the mortality hazard from other causes became higher than that from lung cancer. This critical time point occurred earlier in older patients and later in patients with advanced-stage disease.
Moreover, both cancer-specific survival and overall survival of metastatic patients in the immunotherapy era were significantly better than those in the non-immunotherapy era. CONCLUSIONS: Our findings expand on previous studies by demonstrating that non-lung-cancer-related death risk becomes increasingly predominant over the course of follow-up, and we establish a personalized web-based calculator to determine this critical time point for long-term survivors. We also confirmed the survival benefit for advanced lung cancer patients in the immunotherapy era.
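The conditional survival quantity the study reports is CS(x | s) = S(s + x) / S(s): the probability of surviving x more years given survival to year s. A minimal sketch with a Kaplan-Meier estimator (toy data; the study's analysis uses the full SEER cohort and cause-specific hazards):

```python
import numpy as np

def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate. times: follow-up in years;
    events: 1 = death observed, 0 = censored.
    Returns (unique event times, S(t)) as step-function support and values."""
    order = np.argsort(times)
    times = np.asarray(times, dtype=float)[order]
    events = np.asarray(events)[order]
    uniq = np.unique(times[events == 1])
    at_risk = np.array([(times >= t).sum() for t in uniq])
    deaths = np.array([((times == t) & (events == 1)).sum() for t in uniq])
    return uniq, np.cumprod(1.0 - deaths / at_risk)

def conditional_survival(t_grid, surv, s, x):
    """CS(x | s) = S(s + x) / S(s), looked up on the KM step function."""
    def S(t):
        idx = np.searchsorted(t_grid, t, side="right") - 1
        return 1.0 if idx < 0 else surv[idx]
    return S(s + x) / S(s)
```

Because S(s) shrinks as the cohort is conditioned on longer survival, CS(x | s) typically rises over time, which is exactly the time-dependent improvement in conditional cancer-specific survival the abstract describes.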

    Hybrid-CSR: Coupling Explicit and Implicit Shape Representation for Cortical Surface Reconstruction

    We present Hybrid-CSR, a geometric deep-learning model that combines explicit and implicit shape representations for cortical surface reconstruction. Specifically, Hybrid-CSR begins with explicit deformations of template meshes to obtain coarsely reconstructed cortical surfaces, from which oriented point clouds are estimated for the subsequent differentiable Poisson surface reconstruction. By doing so, our method unifies explicit (oriented point clouds) and implicit (indicator function) cortical surface reconstruction. Compared to explicit representation-based methods, our hybrid approach is better suited to capturing detailed structures, and compared with implicit representation-based methods, our method can be topology-aware because of end-to-end training with a mesh-based deformation module. To address topology defects, we propose a new topology correction pipeline that relies on optimization-based diffeomorphic surface registration. Experimental results on three brain datasets show that our approach surpasses existing implicit and explicit cortical surface reconstruction methods in numeric metrics of accuracy, regularity, and consistency.
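    The oriented point cloud that bridges the explicit and implicit stages can be obtained from a mesh by attaching a normal to every vertex. The sketch below uses area-weighted face normals; it is a simplified stand-in for the oriented-point estimation that feeds a differentiable Poisson step, not the authors' exact procedure.

```python
import numpy as np

def oriented_point_cloud(verts, faces):
    """Turn a triangle mesh into an oriented point cloud (points + unit
    normals) by accumulating area-weighted face normals at each vertex.
    verts: (V, 3) float array; faces: (F, 3) int array with consistent winding."""
    v0, v1, v2 = verts[faces[:, 0]], verts[faces[:, 1]], verts[faces[:, 2]]
    face_n = np.cross(v1 - v0, v2 - v0)        # magnitude = 2 * face area
    normals = np.zeros_like(verts)
    for i in range(3):
        # Unbuffered scatter-add: vertices shared by several faces accumulate.
        np.add.at(normals, faces[:, i], face_n)
    normals /= np.linalg.norm(normals, axis=1, keepdims=True)
    return verts, normals
```

    Consistent face winding is what makes the resulting normals globally oriented, which Poisson-style reconstruction relies on to distinguish inside from outside.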