
    Diffeomorphic Deformation via Sliced Wasserstein Distance Optimization for Cortical Surface Reconstruction

    Mesh deformation is a core task for 3D mesh reconstruction, but defining an efficient discrepancy between predicted and target meshes remains an open problem. A prevalent approach in current deep learning is the set-based approach, which measures the discrepancy between two surfaces by comparing point clouds randomly sampled from each mesh under the Chamfer pseudo-distance. Nevertheless, the set-based approach has limitations: it lacks a theoretical guarantee for choosing the number of sampled points, and the Chamfer divergence is only a pseudo-metric with quadratic computational complexity. To address these issues, we propose a novel metric for learning mesh deformation. The metric is defined by the sliced Wasserstein distance on meshes represented as probability measures, which generalizes the set-based approach. By leveraging the space of probability measures, we gain flexibility in encoding meshes using diverse forms of probability measures, such as continuous, empirical, and discrete measures via a varifold representation. After encoding meshes as probability measures, we compare them using the sliced Wasserstein distance, an effective optimal transport distance with linear computational complexity and a fast statistical rate for approximating mesh surfaces. Furthermore, we employ a neural ordinary differential equation (ODE) to deform the input surface into the target shape by modeling the trajectories of points on the surface. Our experiments on cortical surface reconstruction demonstrate that our approach surpasses competing methods across multiple datasets and metrics.
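The sliced Wasserstein distance used in this abstract reduces optimal transport to many one-dimensional problems, each solvable by sorting. Below is a minimal NumPy sketch for two equal-sized point clouds, as a Monte Carlo estimate over random projection directions; the function name and setup are illustrative, not taken from the paper's code:

```python
import numpy as np

def sliced_wasserstein(X, Y, n_projections=100, seed=0):
    """Monte Carlo sliced 2-Wasserstein distance between two equal-sized
    point clouds X, Y (n x d), viewed as empirical probability measures."""
    rng = np.random.default_rng(seed)
    theta = rng.normal(size=(n_projections, X.shape[1]))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)  # directions on the unit sphere
    Xp, Yp = X @ theta.T, Y @ theta.T                      # 1D projections
    Xp.sort(axis=0)                                        # 1D optimal transport =
    Yp.sort(axis=0)                                        # matching sorted samples
    return np.sqrt(np.mean((Xp - Yp) ** 2))

X = np.random.default_rng(1).normal(size=(256, 3))
Y = X + np.array([1.0, 0.0, 0.0])    # translate by a unit vector v
d = sliced_wasserstein(X, Y)         # ≈ 1/sqrt(3), since E[(theta·v)^2] = |v|^2 / dim
```

Each 1D subproblem costs only a sort, in contrast to the quadratic pairwise comparisons behind the Chamfer divergence, which is the computational advantage the abstract highlights.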

    Point2Point: A Framework for Efficient Deep Learning on Hilbert-Sorted Point Clouds with Applications in Spatio-Temporal Occupancy Prediction

    The irregularity and permutation invariance of point cloud data pose challenges for effective learning. Conventional methods address this by converting raw point clouds to intermediate representations such as 3D voxel grids or range images. While such intermediate representations solve the problem of permutation invariance, they can result in significant loss of information. Approaches that do learn on raw point clouds either have trouble resolving neighborhood relationships between points or are overly complicated in their formulation. In this paper, we propose a novel approach that represents a point cloud as a locality-preserving 1D ordering induced by the Hilbert space-filling curve. We also introduce Point2Point, a neural architecture that can effectively learn on Hilbert-sorted point clouds. We show that Point2Point achieves competitive performance on point cloud segmentation and generation tasks. Finally, we evaluate Point2Point on spatio-temporal occupancy prediction from point clouds.

    Comment: 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
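The Hilbert ordering at the heart of this approach can be sketched in three steps: quantize each point to an integer grid, compute its Hilbert index with the classic bit-manipulation algorithm, and sort by that index. The 2D sketch below illustrates the idea only and is not the paper's implementation (real LiDAR clouds are 3D, where a 3D Hilbert index plays the same role):

```python
import numpy as np

def hilbert_index(x, y, order):
    """Map integer grid coordinates (x, y) in [0, 2**order) to a 1D Hilbert index."""
    d = 0
    s = 2 ** (order - 1)
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        if ry == 0:                          # rotate the quadrant
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        s //= 2
    return d

def hilbert_sort(points, order=10):
    """Sort a 2D point cloud along a Hilbert curve: quantize, index, argsort."""
    pts = np.asarray(points, dtype=float)
    lo, hi = pts.min(axis=0), pts.max(axis=0)
    n = 2 ** order
    grid = ((pts - lo) / (hi - lo + 1e-12) * (n - 1)).astype(int)
    keys = [hilbert_index(int(x), int(y), order) for x, y in grid]
    return pts[np.argsort(keys)]
```

Points that are adjacent in the resulting 1D ordering tend to be spatially close, which is the locality property the architecture exploits while keeping the representation permutation-invariant.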

    Accurate, Fast and Controllable Image and Point Cloud Registration

    Registration is the process of establishing spatial correspondences between two objects. Many downstream tasks, e.g., in image analysis and shape animation, can make use of these spatial correspondences. A variety of registration approaches have been developed over the last decades, but only recently have approaches emerged that can make use of and easily process the large data samples of the big data era. On the one hand, traditional optimization-based approaches are too slow and cannot take advantage of very large data sets. On the other hand, registration users expect more controllable and accurate solutions, since most downstream tasks, e.g., facial animation and 3D reconstruction, increasingly rely on highly precise spatial correspondences. In recent years, deep network registration approaches have become popular because learning-based approaches are fast and can benefit from large-scale data during network training. However, making such deep-learning-based approaches accurate and controllable is still a challenging problem that is far from completely solved. This thesis explores fast, accurate and controllable solutions for image and point cloud registration. Specifically, for image registration, we first improve the accuracy of deep-learning-based approaches by introducing a general framework that combines affine and non-parametric registration to capture both global and local deformation. We then design a more controllable image registration approach in which image regions can be regularized differently according to their local attributes. For point cloud registration, existing works are limited to small-scale problems, can hardly handle complicated transformations, or are slow. We thus develop fast, accurate and controllable solutions for large-scale real-world registration problems by integrating optimal transport with deep geometric learning.

    Doctor of Philosophy
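As background for the registration problem the thesis studies, the classical closed-form solution for rigid alignment with known correspondences (the Kabsch algorithm) is a useful baseline; the thesis itself targets far more general non-parametric and optimal-transport-based registration, so this sketch is context, not its method:

```python
import numpy as np

def rigid_register(X, Y):
    """Closed-form least-squares rigid alignment (Kabsch): find rotation R and
    translation t minimizing ||(X @ R.T + t) - Y||^2, correspondences known."""
    cx, cy = X.mean(axis=0), Y.mean(axis=0)
    H = (X - cx).T @ (Y - cy)                        # cross-covariance of centered clouds
    U, _, Vt = np.linalg.svd(H)
    D = np.eye(X.shape[1])
    D[-1, -1] = np.sign(np.linalg.det(Vt.T @ U.T))   # guard against reflections
    R = Vt.T @ D @ U.T
    t = cy - R @ cx
    return R, t

# Recover a known 30-degree rotation about z plus a translation
a = np.pi / 6
R_true = np.array([[np.cos(a), -np.sin(a), 0.0],
                   [np.sin(a),  np.cos(a), 0.0],
                   [0.0,        0.0,       1.0]])
t_true = np.array([1.0, -2.0, 0.5])
X = np.random.default_rng(0).normal(size=(100, 3))
Y = X @ R_true.T + t_true
R, t = rigid_register(X, Y)
```

Learned and optimal-transport methods generalize this setting to unknown correspondences and non-rigid deformations, which is where the speed and controllability issues the thesis addresses arise.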

    Robust Learning under Distributional Shifts

    Designing robust models is critical for reliable deployment of artificial intelligence systems. Deep neural networks perform exceptionally well on test samples that are drawn from the same distribution as the training set. However, they perform poorly when there is a mismatch between training and test conditions, a phenomenon called distributional shift. For instance, the perception system of a self-driving car can produce erratic predictions when it encounters a new test sample with an illumination or weather condition not seen during training. Such inconsistencies are undesirable and can potentially create life-threatening conditions as these models are deployed in safety-critical applications. In this dissertation, we develop several techniques for effectively handling distributional shifts in deep learning systems. In the first part of the dissertation, we focus on detecting out-of-distribution shifts, which can be used to flag outlier samples at test time. We develop a likelihood estimation framework based on deep generative models for this task. In the second part, we study the domain adaptation problem, where the objective is to tune neural network models to adapt to a specific target distribution of interest. We design novel adaptation algorithms and analyze them under various settings. In the last part of the dissertation, we develop robust learning algorithms that can generalize to novel distributional shifts. In particular, we focus on two types of shifts: covariate and adversarial shifts. All developed algorithms are rigorously evaluated on several benchmark datasets.
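The likelihood-based outlier flagging described in the first part follows a simple recipe: fit a density model to training data, then flag test samples whose log-likelihood falls below a low percentile of the training likelihoods. The sketch below substitutes a multivariate Gaussian for the deep generative model used in the dissertation, purely to illustrate the thresholding logic:

```python
import numpy as np

def fit_gaussian(train):
    """Toy density model standing in for a deep generative model."""
    mu = train.mean(axis=0)
    cov = np.cov(train, rowvar=False) + 1e-6 * np.eye(train.shape[1])
    return mu, cov

def log_likelihood(x, mu, cov):
    """Per-sample Gaussian log-density for the rows of x."""
    d = x - mu
    inv = np.linalg.inv(cov)
    _, logdet = np.linalg.slogdet(cov)
    k = mu.shape[0]
    maha = np.einsum('ij,jk,ik->i', d, inv, d)       # squared Mahalanobis distance
    return -0.5 * (k * np.log(2 * np.pi) + logdet + maha)

rng = np.random.default_rng(0)
train = rng.normal(size=(500, 5))
mu, cov = fit_gaussian(train)
# Flag anything below the 5th percentile of training log-likelihoods
threshold = np.percentile(log_likelihood(train, mu, cov), 5)
is_ood = log_likelihood(train[:10] + 10.0, mu, cov) < threshold   # shifted samples
```

A deep generative model replaces the Gaussian when the data distribution is complex (images, sensor streams), but the flagging rule itself is unchanged.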

    Inter-individual deep image reconstruction via hierarchical neural code conversion

    The sensory cortex is characterized by general organizational principles such as topography and hierarchy. However, measured brain activity given identical input exhibits substantially different patterns across individuals. Although anatomical and functional alignment methods have been proposed in functional magnetic resonance imaging (fMRI) studies, it remains unclear whether and how hierarchical and fine-grained representations can be converted between individuals while preserving the encoded perceptual content. In this study, we trained a functional alignment method called a neural code converter, which predicts a target subject’s brain activity pattern from a source subject’s pattern given the same stimulus, and analyzed the converted patterns by decoding hierarchical visual features and reconstructing perceived images. The converters were trained on fMRI responses to identical sets of natural images presented to pairs of individuals, using voxels in the visual cortex spanning V1 through the ventral object areas, without explicit labels of the visual areas. We decoded the converted brain activity patterns into the hierarchical visual features of a deep neural network using decoders pre-trained on the target subject, and then reconstructed images via the decoded features. Without explicit information about the visual cortical hierarchy, the converters automatically learned the correspondence between visual areas of the same levels. Deep neural network feature decoding at each layer showed higher decoding accuracies from the corresponding levels of visual areas, indicating that hierarchical representations were preserved after conversion. The visual images were reconstructed with recognizable silhouettes of objects even with relatively small amounts of converter-training data. Decoders trained on data pooled from multiple individuals through conversion led to a slight improvement over those trained on a single individual. These results demonstrate that hierarchical and fine-grained representations can be converted by functional alignment while preserving sufficient visual information to enable inter-individual visual image reconstruction.
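A common concrete instantiation of such a voxel-to-voxel converter is a regularized linear map fit on the shared-stimulus trials. Whether the study's converter takes exactly this form is an assumption here; the ridge-regression sketch below (all names illustrative) conveys the idea:

```python
import numpy as np

def fit_converter(src, tgt, alpha=1.0):
    """Ridge-regression map from source-subject voxel patterns (trials x v_src)
    to target-subject patterns (trials x v_tgt), fit on shared-stimulus trials."""
    v = src.shape[1]
    return np.linalg.solve(src.T @ src + alpha * np.eye(v), src.T @ tgt)

def convert(src_pattern, W):
    """Predict the target subject's activity pattern from a source-subject pattern."""
    return src_pattern @ W

# Synthetic check: if the true inter-subject mapping is linear, the converter recovers it.
rng = np.random.default_rng(0)
src = rng.normal(size=(200, 20))      # 200 shared trials, 20 source "voxels"
A = rng.normal(size=(20, 30))         # ground-truth linear code conversion
tgt = src @ A
W = fit_converter(src, tgt, alpha=1e-6)
```

Converted patterns can then be fed to decoders pre-trained on the target subject, which is the pipeline the abstract describes for feature decoding and image reconstruction.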