
    A Closest Point Proposal for MCMC-based Probabilistic Surface Registration

    We propose to view non-rigid surface registration as a probabilistic inference problem. Given a target surface, we estimate the posterior distribution of surface registrations. We demonstrate how the posterior distribution can be used to build shape models that generalize better, and we show how to visualize the uncertainty in the established correspondence. Furthermore, in a reconstruction task, we show how to estimate the posterior distribution of missing data without assuming a fixed point-to-point correspondence. We introduce the closest-point proposal for the Metropolis-Hastings algorithm, which overcomes the slow convergence of a random-walk strategy. As the algorithm decouples inference from modeling the posterior using a propose-and-verify scheme, we show how to choose different distance measures for the likelihood model. All presented results are fully reproducible using publicly available data and our open-source implementation of the registration framework.
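    The propose-and-verify scheme described above can be sketched as a generic Metropolis-Hastings loop in which the proposal only suggests candidates and the acceptance step verifies them against a pluggable likelihood. This is an illustrative sketch, not the paper's closest-point proposal: the 1-D Gaussian target and random-walk proposal are invented stand-ins.

```python
import math
import random

def metropolis_hastings(log_likelihood, propose, theta0, n_iters=5000, seed=0):
    """Propose-and-verify MH loop: `propose` suggests a candidate plus
    the log proposal-density ratio, and the acceptance step verifies it
    against whatever likelihood is plugged in, keeping inference
    decoupled from the likelihood model."""
    rng = random.Random(seed)
    theta, ll = theta0, log_likelihood(theta0)
    samples = []
    for _ in range(n_iters):
        cand, log_q_ratio = propose(theta, rng)  # log q(theta|cand) - log q(cand|theta)
        cand_ll = log_likelihood(cand)
        delta = cand_ll - ll + log_q_ratio
        if delta >= 0 or rng.random() < math.exp(delta):
            theta, ll = cand, cand_ll
        samples.append(theta)
    return samples

# Toy stand-ins (not from the paper): a 1-D Gaussian log-likelihood
# centred at 2.0 and a symmetric random-walk proposal (log-ratio 0).
log_lik = lambda t: -0.5 * (t - 2.0) ** 2
random_walk = lambda t, rng: (t + rng.gauss(0.0, 0.5), 0.0)

samples = metropolis_hastings(log_lik, random_walk, theta0=-5.0)
posterior_mean = sum(samples[1000:]) / len(samples[1000:])
```

Because the acceptance test only consumes a log-likelihood value, swapping in a different distance measure for the likelihood model requires no change to the sampler itself.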

    Super edge 4-points congruent sets-based point cloud global registration

    With the acceleration of three-dimensional (3D) high-frame-rate sensing technologies, dense point clouds collected from multiple standpoints pose a great challenge for the accuracy and efficiency of registration. The combination of coarse registration and fine registration has been extensively promoted. Unlike fine registration, which requires small movements between scan pairs, coarse registration can match scans with arbitrary initial poses. The state-of-the-art coarse method, the Super 4-Points Congruent Sets algorithm (based on 4-Points Congruent Sets), improves the speed of registration to linear order via smart indexing. However, the lack of reduction in the scale of the original point clouds limits its application. Moreover, the coplanarity of registration bases prevents further reduction of the search space. This paper proposes a novel registration method, the Super Edge 4-Points Congruent Sets, to address these problems. The proposed algorithm follows a three-step procedure: boundary segmentation, overlapping-region extraction, and base selection. First, an improved method based on vector angles is used to segment the original point clouds, thinning out their scale. Next, overlapping-region extraction identifies the overlapping regions on the contour. Finally, the proposed method selects registration bases that satisfy distance constraints from the candidate set, without any coplanarity requirement. Experiments on various datasets with different characteristics demonstrate that the average running time of the proposed algorithm improves by 89.76% and the accuracy by 5 mm on average over the Super 4-Points Congruent Sets algorithm. More encouragingly, the experimental results show that the proposed algorithm handles various restrictive cases, such as few overlapping regions and massive noise. Therefore, the algorithm proposed in this paper is a faster and more robust alternative to Super 4-Points Congruent Sets while guaranteeing the promised quality.
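    The base-selection step, distance constraints without a coplanarity requirement, can be sketched as a simple filter over candidate 4-point subsets. The candidate points and thresholds below are invented for illustration and are not from the paper.

```python
import itertools
import math

def select_wide_base(points, d_min, d_max):
    """Return the first 4-point base whose six pairwise distances all
    lie in [d_min, d_max]; unlike classic 4PCS bases, no coplanarity
    is checked, mirroring the relaxed selection described above."""
    for base in itertools.combinations(points, 4):
        pair_dists = [math.dist(p, q) for p, q in itertools.combinations(base, 2)]
        if all(d_min <= d <= d_max for d in pair_dists):
            return base
    return None

# Hypothetical candidate set: four well-spread points plus one point
# too close to the origin to belong to a wide base.
candidates = [(0, 0, 0), (3, 0, 0), (0, 3, 0), (0, 0, 3), (0.1, 0, 0)]
base = select_wide_base(candidates, d_min=2.5, d_max=6.0)
```

Note that the selected base here is a non-coplanar tetrahedron, which classic 4PCS would reject; widening the admissible bases is what shrinks the search space in the method above.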

    Neural Semantic Surface Maps

    We present an automated technique for computing a map between two genus-zero shapes that matches semantically corresponding regions to one another. The lack of annotated data prohibits direct inference of 3D semantic priors; instead, current state-of-the-art methods predominantly optimize geometric properties or require varying amounts of manual annotation. To overcome the lack of annotated training data, we distill semantic matches from pre-trained vision models: our method renders the pair of 3D shapes from multiple viewpoints; the resulting renders are then fed into an off-the-shelf image-matching method, which leverages a pre-trained visual model to produce feature points. This yields semantic correspondences, which can be projected back onto the 3D shapes, producing a raw matching that is inaccurate and inconsistent between different viewpoints. These correspondences are refined and distilled into an inter-surface map by a dedicated optimization scheme, which promotes bijectivity and continuity of the output map. We illustrate that our approach can generate semantic surface-to-surface maps, eliminating the need for manual annotations or any 3D training data. Furthermore, it proves effective in scenarios with high semantic complexity, where objects are non-isometrically related, as well as in situations where they are nearly isometric.
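    One simple way to see why view-inconsistent raw matches can be filtered is majority voting across viewpoints. This is an illustrative sketch only, not the paper's dedicated optimization scheme, and the vertex indices below are invented.

```python
from collections import Counter, defaultdict

def fuse_view_matches(per_view_matches, min_votes=2):
    """Keep a vertex correspondence (va -> vb) only if at least
    `min_votes` viewpoints agree on it, discarding one-off mismatches
    produced by any single render."""
    votes = defaultdict(Counter)
    for matches in per_view_matches:
        for va, vb in matches:
            votes[va][vb] += 1
    return {va: counter.most_common(1)[0][0]
            for va, counter in votes.items()
            if counter.most_common(1)[0][1] >= min_votes}

# Hypothetical per-view matches between vertices of two meshes.
views = [
    [(0, 10), (1, 11), (2, 99)],  # first view: vertex 2 is mismatched
    [(0, 10), (1, 11), (2, 12)],
    [(0, 10), (2, 12)],           # vertex 1 not visible in this view
]
fused = fuse_view_matches(views)
```

Voting alone does not enforce bijectivity or continuity of the final map; that is the job of the optimization scheme described in the abstract.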

    Continuous Modeling of 3D Building Rooftops From Airborne LIDAR and Imagery

    In recent years, a number of mega-cities have provided 3D photorealistic virtual models to support the decision-making process for maintaining the cities' infrastructure and environment more effectively. 3D virtual city models are static snapshots of the environment and represent the status quo at the time of their data acquisition. However, cities are dynamic systems that continuously change over time. Accordingly, their virtual representations need to be updated regularly and in a timely manner to allow for accurate analysis and the simulated results that decisions are based upon. The concept of "continuous city modeling" is to progressively reconstruct city models by accommodating changes recognized in the spatio-temporal domain, while preserving unchanged structures. However, developing a universal intelligent machine enabling continuous modeling remains a challenging task. Therefore, this thesis proposes a novel research framework for continuously reconstructing 3D building rooftops using multi-sensor data. To achieve this goal, we first propose a 3D building rooftop modeling method using airborne LiDAR data. The main focus is the implementation of an implicit regularization method that imposes data-driven building regularity on the noisy boundaries of roof planes for reconstructing 3D building rooftop models. The implicit regularization process is implemented in the framework of Minimum Description Length (MDL) combined with Hypothesize and Test (HAT). Secondly, we propose a context-based geometric hashing method to align newly acquired image data with existing building models. The novelty is the use of context features to achieve robust and accurate matching results. Thirdly, the existing building models are refined by a newly proposed sequential fusion method. The main advantage of the proposed method is its ability to progressively refine modeling errors frequently observed in LiDAR-driven building models. The refinement process is conducted in the framework of MDL combined with HAT. Markov chain Monte Carlo (MCMC) coupled with simulated annealing (SA) is employed to perform a global optimization. The results demonstrate that the proposed continuous rooftop modeling methods are promising in supporting various critical decisions, not only by reconstructing 3D rooftop models accurately, but also by updating the models using multi-sensor data.
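    The simulated-annealing component of the global optimization follows the standard accept-worse-moves-with-probability exp(-dE/T) scheme. The sketch below uses an invented toy energy standing in for an MDL score; it is not the thesis implementation.

```python
import math
import random

def simulated_annealing(energy, neighbor, x0, t0=1.0, cooling=0.995,
                        n_iters=2000, seed=1):
    """Generic SA loop for global optimization: downhill moves are
    always accepted, uphill moves with probability exp(-dE/T), and
    the temperature T decays geometrically each iteration."""
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    best, best_e = x, e
    t = t0
    for _ in range(n_iters):
        cand = neighbor(x, rng)
        cand_e = energy(cand)
        if cand_e < e or rng.random() < math.exp((e - cand_e) / max(t, 1e-12)):
            x, e = cand, cand_e
            if e < best_e:
                best, best_e = x, e
        t *= cooling
    return best, best_e

# Toy 1-D energy with local minima, standing in for an MDL score.
toy_energy = lambda x: (x - 3.0) ** 2 + math.sin(5 * x)
step = lambda x, rng: x + rng.uniform(-0.5, 0.5)
best, best_e = simulated_annealing(toy_energy, step, x0=-4.0)
```

The early high-temperature phase lets the chain escape the sine-induced local minima that would trap a greedy search.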

    Matching Misaligned Two-Resolution Metrology Data

    Multi-resolution metrology devices co-exist in today's manufacturing environment, producing coordinate measurements that complement each other. Typically, the high-resolution device produces a scarce but accurate dataset, whereas the low-resolution one produces a dense but less accurate dataset. Research has shown that combining the two datasets of different resolutions yields better predictions of the geometric features of a manufactured part. A challenge, however, is how to effectively match each high-resolution data point to a low-resolution point that measures approximately the same physical location. A solution to this matching problem appears to be a prerequisite to a good final prediction. This dissertation solves this metrology matching problem by formulating it as a quadratic integer program, aiming to minimize the maximum inter-point-distance difference (maxIPDdiff) among all potential correspondences. Due to the combinatorial nature of the optimization model, solving it to optimality is computationally prohibitive even for a small problem size. In order to solve real-life-sized problems within a reasonable amount of time, a two-stage matching framework (TSMF) is proposed. The TSMF approach follows a coarse-to-fine search strategy and consists of down-sampling the full-size problem, solving the down-sampled problem to optimality, extending that solution to the full-size problem, and refining the solution using iterative local search. Many manufactured parts are designed with symmetric features; that is, many part surfaces are invariant under (are mapped to themselves by) certain intrinsic reflections and/or rotations. Dealing with part surfaces with symmetric features makes the metrology matching problem even more challenging. The new challenge is that, due to this symmetry, alignment performance metrics such as maxIPDdiff and root mean square error cannot differentiate between (a) correct solutions/correspondences that are orientationally consistent with the underlying true correspondences and (b) incorrect but seemingly correct solutions obtained by applying the surface's intrinsic reflections and/or rotations to a correct set of correspondences. To address this challenge, a filtering procedure is proposed to supplement the TSMF approach. Specifically, the filtering procedure works by generating a solution pool that contains a group of plausible candidate sets of correspondences and subsequently filtering this pool to select a correct set of correspondences. Numerical experiments show that the TSMF approach outperforms two widely used point-set registration alternatives, the iterative closest point (ICP) and coherent point drift (CPD) methods, in terms of several performance metrics. Moreover, compared to ICP and CPD, the TSMF approach scales very well as the instance size increases and is robust with respect to the initial degree of misalignment between the two datasets. The numerical results also show that, when enhanced with the proposed filtering procedure, TSMF exhibits much better alignment performance than TSMF without filtering, CPD, and ICP, in terms of both the orientational correctness of the selected solution and several other performance metrics. Furthermore, in terms of computational performance, TSMF (with and without filtering) can solve real-life-sized metrology data matching problems within a reasonable amount of time. Therefore, both variants are well suited to serve as off-line tools in the manufacturing quality control process.
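    The maxIPDdiff objective can be written down directly from its definition: the largest absolute difference between an inter-point distance in the high-resolution set and the distance between the matched low-resolution points. The coordinates below are invented for illustration.

```python
import itertools
import math

def max_ipd_diff(high_pts, low_pts, corr):
    """maxIPDdiff for a given correspondence `corr` (corr[i] is the
    index of the low-resolution point matched to high point i):
    max over all high-point pairs (i, j) of
    |dist(high_i, high_j) - dist(low_corr[i], low_corr[j])|."""
    return max(
        abs(math.dist(high_pts[i], high_pts[j])
            - math.dist(low_pts[corr[i]], low_pts[corr[j]]))
        for i, j in itertools.combinations(range(len(high_pts)), 2)
    )

high = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
low = [(0.05, 0.0), (1.0, 0.1), (0.0, 1.9), (5.0, 5.0)]
good = max_ipd_diff(high, low, corr=[0, 1, 2])
bad = max_ipd_diff(high, low, corr=[0, 1, 3])  # last point matched to an outlier
```

Because the metric compares only inter-point distances, it is invariant to any rigid motion of either set, which is exactly why symmetric parts admit the seemingly correct but mis-oriented solutions discussed above.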

    Towards a framework for multi class statistical modelling of shape, intensity, and kinematics in medical images

    Statistical modelling has become a ubiquitous tool for analysing morphological variation of bone structures in medical images. For radiological images, the shape, the relative pose between bone structures, and the intensity distribution are key features, often modelled separately. A wide range of research has reported methods that incorporate these features as priors for machine learning purposes. Statistical shape, appearance (intensity profile in images), and pose models are popular priors for explaining variability across a sample population of rigid structures. However, a principled and robust way to combine shape, pose, and intensity features has been elusive for four main reasons: 1) heterogeneity of the data (linear and non-linear natural variation across features); 2) sub-optimal representation of three-dimensional Euclidean motion; 3) artificial discretization of the models; and 4) lack of an efficient transfer-learning process to project observations into the latent space. This work proposes a novel statistical modelling framework for multiple bone structures. The framework provides a latent space embedding shape, pose, and intensity in a continuous domain, allowing for new approaches to skeletal joint analysis from medical images. First, a robust registration method for multi-volumetric shapes is described. Both sampling-based and parametric registration algorithms are proposed, which allow the establishment of dense correspondence across volumetric shapes (such as tetrahedral meshes) while preserving the spatial relationship between them. Next, the framework for developing statistical shape-kinematics models from in-correspondence multi-volumetric shapes embedding the image intensity distribution is presented. The framework incorporates principal geodesic analysis and a non-linear metric for modelling the spatial orientation of the structures. More importantly, because all the features live in a joint statistical space and a continuous domain, the framework permits on-demand marginalisation to a region or feature of interest without training separate models. Thereafter, automated prediction of the structures in images is facilitated by a model-fitting method leveraging the models as priors in a Markov chain Monte Carlo approach. The framework is validated using controlled experimental data, and the results demonstrate superior performance in comparison with state-of-the-art methods. Finally, the application of the framework to analysing computed tomography images is presented. The analyses include estimation of shape, kinematic, and intensity profiles of bone structures in the shoulder and hip joints. For both of these datasets, the framework is demonstrated for segmentation, registration, and reconstruction, including the recovery of patient-specific intensity profiles. The presented framework realises a new paradigm in modelling multi-object shape structures, allowing for probabilistic modelling not only of shape but also of relative pose and intensity, as well as the correlations between them. Future work will aim to optimise the framework for clinical use in medical image analysis.
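    The on-demand marginalisation property follows from keeping one joint model over the stacked features: for a Gaussian latent space, marginalising to a feature block is just index selection on the mean and covariance, with no retraining. A minimal sketch with invented 4-D feature vectors (two "shape" dimensions, two "intensity" dimensions), standing in for the thesis's richer latent space:

```python
def fit_joint_gaussian(samples):
    """Mean and unbiased sample covariance of stacked feature vectors,
    e.g. shape | pose | intensity concatenated per subject."""
    n, d = len(samples), len(samples[0])
    mean = [sum(s[k] for s in samples) / n for k in range(d)]
    cov = [[sum((s[i] - mean[i]) * (s[j] - mean[j]) for s in samples) / (n - 1)
            for j in range(d)] for i in range(d)]
    return mean, cov

def marginalise(mean, cov, idx):
    """Marginalising a joint Gaussian to the features in `idx` needs
    no retraining: select the mean entries and covariance sub-block."""
    return ([mean[i] for i in idx],
            [[cov[i][j] for j in idx] for i in idx])

# Hypothetical subjects: [shape_0, shape_1, intensity_0, intensity_1].
data = [[1.0, 2.0, 10.0, 0.5],
        [1.2, 1.9, 11.0, 0.4],
        [0.8, 2.1, 9.0, 0.6]]
mean, cov = fit_joint_gaussian(data)
shape_mean, shape_cov = marginalise(mean, cov, idx=[0, 1])
```

The retained off-diagonal covariance blocks are what encode the shape-pose-intensity correlations the framework models jointly.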

    Exploiting Structural Regularities and Beyond: Vision-based Localization and Mapping in Man-Made Environments

    Image-based estimation of camera motion, known as visual odometry (VO), plays a very important role in many robotic applications such as control and navigation of unmanned mobile robots, especially when no external navigation reference signal is available. The core problem of VO is the estimation of the camera's ego-motion (i.e. tracking), either between successive frames (relative pose estimation) or with respect to a global map (absolute pose estimation). This thesis aims to develop efficient, accurate, and robust VO solutions by taking advantage of structural regularities in man-made environments, such as piece-wise planar structures, the Manhattan World and, more generally, contours and edges. Furthermore, to handle challenging scenarios beyond the limits of classical sensor-based VO solutions, we investigate a recently emerging sensor, the event camera, and study event-based mapping, one of the key problems in event-based VO/SLAM. The main achievements are summarized as follows. First, we revisit an old topic in relative pose estimation: accurately and robustly estimating the fundamental matrix given a collection of independently estimated homographies. Three classical methods are reviewed, and we then show that a simple but nontrivial two-step normalization within the direct linear method achieves performance similar to the less attractive and more computationally intensive hallucinated-points-based method. Second, an efficient 3D rotation estimation algorithm for depth cameras in piece-wise planar environments is presented. We show that, using surface normal vectors as input, planar modes in the corresponding density distribution function can be discovered and continuously tracked using efficient non-parametric estimation techniques. The relative rotation can then be estimated by registering entire bundles of planar modes using robust L1-norm minimization. Third, an efficient alternative to the iterative closest point algorithm for real-time tracking of modern depth cameras in Manhattan Worlds is developed. We exploit the common orthogonal structure of man-made environments to decouple the estimation of the rotation from the three degrees of freedom of the translation. The derived camera orientation is absolute and thus free of long-term drift, which in turn benefits the accuracy of the translation estimation as well. Fourth, we look into a more general structural regularity: edges. A real-time VO system that uses Canny edges is proposed for RGB-D cameras. Two novel alternatives to classical distance transforms are developed, with properties that significantly improve on classical Euclidean distance field based methods in terms of efficiency, accuracy, and robustness. Finally, to deal with challenging scenarios beyond what standard RGB/RGB-D cameras can handle, we investigate the recently emerging event camera and focus on the problem of 3D reconstruction from data captured by a stereo event-camera rig moving in a static scene, such as in the context of stereo Simultaneous Localization and Mapping.
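    A brute-force Euclidean distance field conveys the idea behind the edge-based tracking above: precompute, per pixel, the distance to the nearest edge, so that a candidate pose can be scored by summing the field under its reprojected edges. This is only a conceptual stand-in; the thesis develops faster and more robust alternatives. Grid size and edge pixels are invented.

```python
def distance_field(edges, width, height):
    """Each cell (y, x) stores the Euclidean distance to the nearest
    edge pixel; well-aligned poses place reprojected edges on low
    values, so the sum of the field under them acts as a tracking
    cost. O(W*H*E) here, versus linear-time true distance transforms."""
    return [[min(((x - ex) ** 2 + (y - ey) ** 2) ** 0.5 for ex, ey in edges)
             for x in range(width)]
            for y in range(height)]

edges = [(2, 2), (5, 2)]                      # (x, y) edge pixels
field = distance_field(edges, width=8, height=5)
on_edge = field[2][2]    # distance at an edge pixel
off_edge = field[4][7]   # distance far from both edge pixels
```

Since the field is built once per frame, evaluating many candidate poses amounts to cheap lookups, which is what makes edge-based tracking real-time capable.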