    Robust 2D and 3D registration with deep neural networks

    Recovering 3D geometry is a crucial task in computer vision, essential for accurate world reconstruction and perception. Modern applications in AR, VR, autonomous driving, and medical imaging rely heavily on 3D and 4D reconstruction techniques. This thesis aims to enhance registration methods, which play a key role in reconstruction, by fusing classical multi-view geometry with deep neural networks. We explore this theme in three primary directions, each distinguished by registration dimensionality: 3D–3D, 3D–2D, and 2D–2D. First, we focus on improving the alignment of 3D point clouds in both rigid and non-rigid scenarios. In non-rigid 3D registration, traditional methods directly optimize a motion field between a source and target surface, which often leads to slow convergence and entrapment in local minima. We introduce a neural network-based scene flow to initialize the optimization, providing a more efficient and robust solution. Additionally, we present a novel surface normal estimation technique that aids both rigid and non-rigid registration: unlike conventional methods that use a fixed global neighbourhood size, our approach employs a self-attention mechanism that adapts to local geometry variations. Second, we address the challenge of registering 2D images to 3D Neural Radiance Fields (NeRF) through joint optimization of NeRF and camera parameters. Original NeRF training requires pre-processed camera parameters, creating a bottleneck in the workflow. Our approach allows for end-to-end camera parameter estimation during NeRF training while reusing the existing photometric loss in NeRF, and we further extend it to handle larger camera movements by incorporating a monocular depth prior. Lastly, we propose a method for interest point discovery, which benefits 2D image registration. Existing interest point detection methods degrade under significant viewpoint changes and at occlusion boundaries; our multi-view interest point discovery approach addresses these limitations and is trained in a self-supervised fashion with purely geometric constraints that encourage repeatability, sparsity, and multi-view consistency. In summary, this thesis explores the fusion of traditional multi-view geometry concepts with deep learning priors across various registration tasks, including point cloud registration, image-to-NeRF registration, and image-to-image registration.
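
    The image-to-NeRF direction above hinges on reusing the photometric loss to update camera parameters alongside the radiance field. Below is a minimal, hypothetical sketch of that joint optimization; the toy MLP radiance field, the axis-angle pose parameterisation, and the crude averaging renderer are illustrative assumptions, not the thesis implementation.

```python
import torch
import torch.nn as nn

H = W = 16                                   # tiny image purely for illustration
target = torch.rand(H, W, 3)                 # stand-in for an observed image

# Toy "radiance field": colour as a function of 3D position only.
field = nn.Sequential(nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, 3), nn.Sigmoid())
pose = torch.zeros(6, requires_grad=True)    # learnable camera: axis-angle rotation + translation

def so3_exp(w):
    """Rodrigues' formula: axis-angle vector -> rotation matrix (differentiable)."""
    theta = w.norm() + 1e-8
    k = w / theta
    K = torch.stack([
        torch.stack([torch.zeros(()), -k[2], k[1]]),
        torch.stack([k[2], torch.zeros(()), -k[0]]),
        torch.stack([-k[1], k[0], torch.zeros(())]),
    ])
    return torch.eye(3) + torch.sin(theta) * K + (1 - torch.cos(theta)) * (K @ K)

def render(p):
    """Grossly simplified renderer: average field colours along each pixel ray."""
    R, t = so3_exp(p[:3]), p[3:]
    i, j = torch.meshgrid(torch.arange(H), torch.arange(W), indexing="ij")
    dirs = torch.stack([(j - W / 2) / W, (i - H / 2) / H, torch.ones(H, W)], dim=-1)
    dirs = dirs @ R.T                                    # rotate rays into the world frame
    depths = torch.linspace(0.5, 2.0, steps=8)
    pts = t + dirs[..., None, :] * depths[:, None]       # (H, W, 8, 3) sample points
    return field(pts).mean(dim=-2)

# Both the field weights and the camera pose receive gradients from the same photometric loss.
opt = torch.optim.Adam([{"params": field.parameters()}, {"params": [pose], "lr": 1e-2}], lr=1e-3)
for _ in range(200):
    opt.zero_grad()
    loss = ((render(pose) - target) ** 2).mean()
    loss.backward()
    opt.step()
```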

    Neighbourhood-Insensitive Point Cloud Normal Estimation Network

    We introduce a novel self-attention-based normal estimation network that is able to focus softly on relevant points and adjust the softness by learning a temperature parameter, making it able to work naturally and effectively within a large neighbourhood range. As a result, our model outperforms all existing normal estimation algorithms by a large margin, achieving 94.1% accuracy compared with the previous state of the art of 91.2%, with a 25x smaller model and 12x faster inference time. We also use point-to-plane Iterative Closest Point (ICP) as an application case to show that our normal estimates lead to faster convergence than those from other methods, without manually fine-tuning neighbourhood range parameters. Accepted in BMVC 2020 as an oral presentation. Code available at https://code.active.vision and project page at http://ninormal.active.vision
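
    A rough, hypothetical sketch of the temperature-scaled soft attention idea follows; the scoring MLP, the learnable temperature, and the weighted plane fit are illustrative assumptions rather than the paper's architecture.

```python
import torch
import torch.nn as nn

class SoftNeighbourhoodNormal(nn.Module):
    """Toy normal estimator: score neighbours, soften the scores with a learned
    temperature, then fit a plane via the weighted covariance (sketch only)."""

    def __init__(self, hidden=64):
        super().__init__()
        self.score = nn.Sequential(nn.Linear(3, hidden), nn.ReLU(), nn.Linear(hidden, 1))
        self.log_temperature = nn.Parameter(torch.zeros(1))     # learned softness

    def forward(self, neighbours):                  # (N, K, 3): K neighbours per query point
        centred = neighbours - neighbours.mean(dim=1, keepdim=True)
        logits = self.score(centred).squeeze(-1)    # (N, K) relevance of each neighbour
        weights = torch.softmax(logits / self.log_temperature.exp(), dim=-1)
        cov = torch.einsum("nk,nki,nkj->nij", weights, centred, centred)  # weighted 3x3 covariance
        _, eigvecs = torch.linalg.eigh(cov)         # eigenvalues returned in ascending order
        return eigvecs[..., 0]                      # normal = eigenvector of the smallest eigenvalue

# Usage: estimate normals for 128 query points, each with a 48-point neighbourhood.
normals = SoftNeighbourhoodNormal()(torch.rand(128, 48, 3))
```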

    Direct-PoseNet: Absolute Pose Regression with Photometric Consistency

    We present a relocalization pipeline that combines an absolute pose regression (APR) network with a novel direct matching module based on view synthesis, offering superior accuracy while maintaining low inference time. Our contribution is twofold: i) we design a direct matching module that supplies a photometric supervision signal to refine the pose regression network via differentiable rendering; ii) we modify the rotation representation from the classical quaternion to SO(3) in pose regression, removing the need to balance rotation and translation loss terms. As a result, our network Direct-PoseNet achieves state-of-the-art performance among all single-image APR methods on the 7-Scenes benchmark and the LLFF dataset.
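
    As a hedged illustration of the two ideas above, the sketch below regresses an unconstrained 3x3 matrix, projects it onto SO(3) with an SVD, and refines the regressor through a photometric loss from a stand-in differentiable renderer; the network sizes and the renderer are placeholder assumptions, not the Direct-PoseNet implementation.

```python
import torch
import torch.nn as nn

class PoseRegressor(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 256), nn.ReLU())
        self.head = nn.Linear(256, 12)              # 9 rotation entries + 3 translation entries

    def forward(self, img):
        p = self.head(self.backbone(img))
        R_raw, t = p[:, :9].view(-1, 3, 3), p[:, 9:]
        U, _, Vt = torch.linalg.svd(R_raw)          # project onto SO(3) instead of using a quaternion
        det = torch.det(U @ Vt)                     # sign fix keeps the determinant at +1
        D = torch.diag_embed(torch.stack([torch.ones_like(det), torch.ones_like(det), det], dim=-1))
        return U @ D @ Vt, t

def render_at(R, t):
    """Stand-in for a frozen, differentiable novel-view synthesiser."""
    return (R.sum() + t.sum()).sigmoid() * torch.ones(R.shape[0], 3, 32, 32)

regressor = PoseRegressor()
opt = torch.optim.Adam(regressor.parameters(), lr=1e-4)
query = torch.rand(2, 3, 32, 32)                    # query images
R, t = regressor(query)
loss = ((render_at(R, t) - query) ** 2).mean()      # single photometric loss, no rotation/translation balancing
loss.backward()
opt.step()
```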

    Observation of the decays B0(s) → Ds1(2536)∓K±

    This paper reports the observation of the decays B0(s) → Ds1(2536)∓K± using proton-proton collision data collected by the LHCb experiment, corresponding to an integrated luminosity of 9 fb−1. The branching fractions of these decays are measured relative to the normalisation channel B0 → D0K+K−. The Ds1(2536)− meson is reconstructed in the D∗(2007)0K− decay channel and the products of branching fractions are measured to be B(B0s → Ds1(2536)∓K±) × B(Ds1(2536)− → D∗(2007)0K−) = (2.49 ± 0.11 ± 0.12 ± 0.25 ± 0.06) × 10−5 and B(B0 → Ds1(2536)∓K±) × B(Ds1(2536)− → D∗(2007)0K−) = (0.510 ± 0.021 ± 0.036 ± 0.050) × 10−5. The first uncertainty is statistical, the second systematic, and the third arises from the uncertainty of the branching fraction of the B0 → D0K+K− normalisation channel; the last uncertainty in the B0s result is due to the limited knowledge of the fragmentation fraction ratio fs/fd. The significance for the B0s and B0 signals is larger than 10σ. The ratio of the helicity amplitudes that governs the angular distribution of the Ds1(2536)− → D∗(2007)0K− decay is determined from the data: the ratio of the S- and D-wave amplitudes is found to be 1.11 ± 0.15 ± 0.06 and the phase difference between them to be 0.70 ± 0.09 ± 0.04 rad, where the first uncertainty is statistical and the second systematic. We express our gratitude to our colleagues in the CERN accelerator departments for the excellent performance of the LHC. We thank the technical and administrative staff at the LHCb institutes. We acknowledge support from CERN and from the national agencies: CAPES, CNPq, FAPERJ and FINEP (Brazil); MOST and NSFC (China); CNRS/IN2P3 (France); BMBF, DFG and MPG (Germany); INFN (Italy); NWO (Netherlands); MNiSW and NCN (Poland); MCID/IFA (Romania); MICINN (Spain); SNSF and SER (Switzerland); NASU (Ukraine); STFC (United Kingdom); DOE NP and NSF (USA). We acknowledge the computing resources that are provided by CERN, IN2P3 (France), KIT and DESY (Germany), INFN (Italy), SURF (Netherlands), PIC (Spain), GridPP (United Kingdom), CSCS (Switzerland), IFIN-HH (Romania), CBPF (Brazil), Polish WLCG (Poland) and NERSC (USA). We are indebted to the communities behind the multiple open-source software packages on which we depend. Individual groups or members have received support from ARC and ARDC (Australia); Minciencias (Colombia); AvH Foundation (Germany); EPLANET, Marie Skłodowska-Curie Actions, ERC and NextGenerationEU (European Union); A*MIDEX, ANR, IPhU and Labex P2IO, and Région Auvergne-Rhône-Alpes (France); Key Research Program of Frontier Sciences of CAS, CAS PIFI, CAS CCEPP, Fundamental Research Funds for the Central Universities, and Sci. & Tech. Program of Guangzhou (China); GVA, XuntaGal, GENCAT, Inditex, InTalent and Prog. Atracción Talento, CM (Spain); SRC (Sweden); the Leverhulme Trust, the Royal Society and UKRI (United Kingdom).

    Reachability-Based Confidence-Aware Probabilistic Collision Detection in Highway Driving

    Risk assessment is a crucial component of collision warning and avoidance systems in intelligent vehicles. To accurately detect potential vehicle collisions, reachability-based formal approaches have been developed to ensure driving safety, but they suffer from over-conservatism, potentially leading to false-positive risk events in complicated real-world applications. In this work, we combine two reachability analysis techniques, i.e., the backward reachable set (BRS) and the stochastic forward reachable set (FRS), and propose an integrated probabilistic collision detection framework for highway driving. Within the framework, we first use a BRS to formally check whether a two-vehicle interaction is safe; if it is not, a prediction-based stochastic FRS is employed to estimate a collision probability at each future time step. In doing so, the framework can not only identify non-risky events with guaranteed safety, but also provide accurate collision risk estimation in safety-critical events. To construct the stochastic FRS, we develop a neural network-based acceleration model for surrounding vehicles and further incorporate a confidence-aware dynamic belief to improve prediction accuracy. Extensive experiments validate the performance of the acceleration prediction model on naturalistic highway driving data, and the efficiency and effectiveness of the framework with the infused confidence belief are tested in both naturalistic and simulated highway scenarios. The proposed risk assessment framework is promising in real-world applications. Comment: Under review at Engineering. arXiv admin note: text overlap with arXiv:2205.0135
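
    The two-stage decision described above lends itself to a short sketch: a formal BRS membership test screens out provably safe interactions, and only the remaining cases are rolled forward with sampled accelerations as a Monte-Carlo stand-in for the stochastic FRS. The BRS test, the acceleration predictor, the horizon, and all thresholds below are placeholder assumptions, not the paper's models.

```python
import numpy as np

def inside_brs(rel_state):
    """Stand-in for the formal backward-reachable-set membership test."""
    gap, rel_speed = rel_state
    return gap < 5.0 and rel_speed < 0.0            # placeholder "potentially unsafe" condition

def predict_accel_distribution(history):
    """Stand-in for the neural acceleration model: (mean, std) per future step."""
    return [(0.0, 0.8)] * 20                        # 20 steps of relative acceleration in m/s^2

def collision_probability(rel_state, history, dt=0.1, samples=500):
    gap, rel_speed = rel_state
    rng = np.random.default_rng(0)
    g = np.full(samples, gap)
    v = np.full(samples, rel_speed)
    hit = np.zeros(samples, dtype=bool)
    for mean, std in predict_accel_distribution(history):
        a = rng.normal(mean, std, samples)          # sampled relative acceleration
        v = v + a * dt
        g = g + v * dt
        hit |= g <= 0.0                             # gap closed => collision in that rollout
    return hit.mean()

def assess_risk(rel_state, history=None):
    if not inside_brs(rel_state):
        return 0.0                                  # formally verified safe: no risk flagged
    return collision_probability(rel_state, history)

print(assess_risk((3.0, -1.5)))                     # small closing gap: probabilistic stage runs
```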