
20 research outputs found

CP-SLAM: Collaborative Neural Point-based SLAM System

Author: Bao Hujun
Cui Zhaopeng
Hu Jiarui
Mao Mao
Zhang Guofeng
Publication date: 14/11/2023

This paper presents a collaborative implicit neural simultaneous localization and mapping (SLAM) system with RGB-D image sequences, which consists of complete front-end and back-end modules including odometry, loop detection, sub-map fusion, and global refinement. In order to enable all these modules in a unified framework, we propose a novel neural point based 3D scene representation in which each point maintains a learnable neural feature for scene encoding and is associated with a certain keyframe. Moreover, a distributed-to-centralized learning strategy is proposed for the collaborative implicit SLAM to improve consistency and cooperation. A novel global optimization framework is also proposed to improve the system accuracy like traditional bundle adjustment. Experiments on various datasets demonstrate the superiority of the proposed method in both camera tracking and mapping.

Comment: Accepted at NeurIPS 202

arXiv.org e-Print Archive

PATS: Patch Area Transportation with Subdivision for Local Feature Matching

Author: Bao Hujun
Cui Zhaopeng
Huang Zhaoyang
Li Hongsheng
Li Yijin
Ni Junjie
Zhang Guofeng
Publication date: 14/03/2023

Local feature matching aims at establishing sparse correspondences between a pair of images. Recently, detector-free methods present generally better performance but are not satisfactory in image pairs with large scale differences. In this paper, we propose Patch Area Transportation with Subdivision (PATS) to tackle this issue. Instead of building an expensive image pyramid, we start by splitting the original image pair into equal-sized patches and gradually resizing and subdividing them into smaller patches with the same scale. However, estimating scale differences between these patches is non-trivial since the scale differences are determined by both relative camera poses and scene structures, and thus spatially varying over image pairs. Moreover, it is hard to obtain the ground truth for real scenes. To this end, we propose patch area transportation, which enables learning scale differences in a self-supervised manner. In contrast to bipartite graph matching, which only handles one-to-one matching, our patch area transportation can deal with many-to-many relationships. PATS improves both matching accuracy and coverage, and shows superior performance in downstream tasks, such as relative pose estimation, visual localization, and optical flow estimation. The source code will be released to benefit the community.

Comment: Project page: https://zju3dv.github.io/pat

arXiv.org e-Print Archive

SINE: Semantic-driven Image-based NeRF Editing with Prior-guided Editing Field

Author: Bao Chong
Bao Hujun
Cui Zhaopeng
Fan Tianxing
Yang Bangbang
Yang Zesong
Zhang Guofeng
Zhang Yinda
Publication date: 25/03/2023

Despite the great success in 2D editing using user-friendly tools, such as Photoshop, semantic strokes, or even text prompts, similar capabilities in 3D areas are still limited, either relying on 3D modeling skills or allowing editing within only a few categories. In this paper, we present a novel semantic-driven NeRF editing approach, which enables users to edit a neural radiance field with a single image, and faithfully delivers edited novel views with high fidelity and multi-view consistency. To achieve this goal, we propose a prior-guided editing field to encode fine-grained geometric and texture editing in 3D space, and develop a series of techniques to aid the editing process, including cyclic constraints with a proxy mesh to facilitate geometric supervision, a color compositing mechanism to stabilize semantic-driven texture editing, and a feature-cluster-based regularization to preserve the irrelevant content unchanged. Extensive experiments and editing examples on both real-world and synthetic data demonstrate that our method achieves photo-realistic 3D editing using only a single edited image, pushing the bound of semantic-driven editing in 3D real-world scenes. Our project webpage: https://zju3dv.github.io/sine/

Comment: Accepted to CVPR 2023. Project Page: https://zju3dv.github.io/sine

arXiv.org e-Print Archive

PVO: Panoptic Visual Odometry

Author: Bao Hujun
Chen Shuo
Cui Zhaopeng
Lan Xinyue
Ming Yuhang
Ye Weicai
Yu Xingyuan
Zhang Guofeng
Publication date: 26/03/2023

We present PVO, a novel panoptic visual odometry framework to achieve more comprehensive modeling of the scene motion, geometry, and panoptic segmentation information. Our PVO models visual odometry (VO) and video panoptic segmentation (VPS) in a unified view, which makes the two tasks mutually beneficial. Specifically, we introduce a panoptic update module into the VO Module with the guidance of image panoptic segmentation. This Panoptic-Enhanced VO Module can alleviate the impact of dynamic objects in the camera pose estimation with a panoptic-aware dynamic mask. On the other hand, the VO-Enhanced VPS Module also improves the segmentation accuracy by fusing the panoptic segmentation result of the current frame on the fly to the adjacent frames, using geometric information such as camera pose, depth, and optical flow obtained from the VO Module. These two modules contribute to each other through recurrent iterative optimization. Extensive experiments demonstrate that PVO outperforms state-of-the-art methods in both visual odometry and video panoptic segmentation tasks.

Comment: CVPR 2023. Project page: https://zju3dv.github.io/pvo/ Code: https://github.com/zju3dv/PV

arXiv.org e-Print Archive

BlinkFlow: A Dataset to Push the Limits of Event-based Optical Flow Estimation

Author: Bao Hujun
Chen Shuo
Cui Zhaopeng
Huang Zhaoyang
Li Hongsheng
Li Yijin
Shi Xiaoyu
Zhang Guofeng
Publication date: 14/03/2023

Event cameras provide high temporal precision, low data rates, and high dynamic range visual perception, which are well-suited for optical flow estimation. While data-driven optical flow estimation has obtained great success in RGB cameras, its generalization performance is seriously hindered in event cameras mainly due to the limited and biased training data. In this paper, we present a novel simulator, BlinkSim, for the fast generation of large-scale data for event-based optical flow. BlinkSim consists of a configurable rendering engine and a flexible engine for event data simulation. By leveraging the wealth of current 3D assets, the rendering engine enables us to automatically build up thousands of scenes with different objects, textures, and motion patterns and render very high-frequency images for realistic event data simulation. Based on BlinkSim, we construct a large training dataset and evaluation benchmark, BlinkFlow, that contains sufficient, diversiform, and challenging event data with optical flow ground truth. Experiments show that BlinkFlow improves the generalization performance of state-of-the-art methods by more than 40% on average and up to 90%. Moreover, we further propose an Event optical Flow transFormer (E-FlowFormer) architecture. Powered by our BlinkFlow, E-FlowFormer outperforms the SOTA methods by up to 91% on the MVSEC dataset and 14% on the DSEC dataset and presents the best generalization performance.

arXiv.org e-Print Archive

Safety risk assessment of subway shield construction under-crossing a river using CFA and FER

Author: Huihua Chen
Hujun Li
Jianhua Cheng
Ke Yang
Kuang He
Tianlin Cui
Yanlong Huang
Publication venue: Frontiers Media S.A.
Publication date: 01/02/2024

Numerous subway projects are planned by China's city governments, and more subways can hardly avoid under-crossing rivers. While often being located in complex natural and social environments, subway shield construction under-crossing a river (SSCUR) is more susceptible to safety accidents, causing substantial casualties and monetary losses. Therefore, there is an urgent need to investigate safety risks during SSCUR. The paper identified the safety risks during SSCUR by using a literature review and experts' evaluation, proposed a new safety risk assessment model by integrating confirmatory factor analysis (CFA) and fuzzy evidence reasoning (FER), and then selected a project to validate the feasibility of the proposed model. Research results show that (a) a safety risk list of SSCUR was identified, including 5 first-level safety risks and 38 second-level safety risks; (b) the proposed safety risk assessment model can be used to assess the safety risk of SSCUR; (c) safety inspection, safety organization and duty, quicksand layer, and high-pressure phreatic water were the high-level risks, and the onsite total safety risk was at the medium level; (d) management-type safety risks, environment-type safety risks, and personnel-type safety risks have higher expected utility values, and manager-type safety risks were expected to have higher risk-utility values when compared to worker-type safety risks. The research can enrich the theoretical knowledge of SSCUR safety risk assessment and provide references to safety managers for conducting scientific and effective safety management on the construction site when a subway crosses under a river.

Directory of Open Access Journals

Steady-State and Dynamic Rheological Properties of a Mineral Oil-Based Ferrofluid

Author: Hongchao Cui
Hujun Wang
Jiahao Dong
Yuan Meng
Zhenkun Li
Publication venue: MDPI AG
Publication date: 13/09/2022

In this study, nanoparticles were suspended in L-AN32 total loss system oil. The thixotropic yield behavior and viscoelastic behavior of ferrofluid were analyzed by steady-state and dynamic methods and explained according to the microscopic mechanism of magneto-rheology. The Herschel–Bulkley (H–B) model was used to fit the ferrofluid flow curves, and the observed static yield stress was greater than the dynamic yield stress. Both the static and dynamic yield stress values increased as the magnetic field increased, and the corresponding shear thinning viscosity curve increased more significantly as the magnetic field strength increased. The amplitude scanning results show that the linear viscoelastic region (LVE) is reached when the shear stress is 10%. The frequency scanning results showed that the storage modulus increased with the increase of the frequency at first. The storage modulus increased steadily at a higher frequency range, while the loss modulus increased slowly at the initial stage and rapidly at the later stage. In the amplitude sweep and frequency sweep experiments, the energy storage modulus and loss modulus are enhanced with the decrease of temperature. These findings are helpful to better understand the microscopic mechanism of magneto-rheology of ferrofluids, and also provide guidance for many practical applications.

Multidisciplinary Digital Publishing Institute
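
As a rough illustration of the Herschel–Bulkley (H–B) model named in the ferrofluid abstract above, the sketch below evaluates the model's standard form, tau = tau_0 + K * gamma_dot^n, and the resulting apparent viscosity. The function names and all parameter values are hypothetical, chosen only for illustration; they are not taken from the study, and this is not the authors' fitting procedure.

```python
def herschel_bulkley_stress(shear_rate, yield_stress, consistency, flow_index):
    """Shear stress of a flowing Herschel-Bulkley fluid.

    tau = tau_0 + K * gamma_dot**n, valid for shear_rate > 0.
    """
    if shear_rate <= 0:
        raise ValueError("shear_rate must be positive")
    return yield_stress + consistency * shear_rate ** flow_index


def apparent_viscosity(shear_rate, yield_stress, consistency, flow_index):
    """Apparent viscosity eta = tau / gamma_dot for the same model."""
    stress = herschel_bulkley_stress(
        shear_rate, yield_stress, consistency, flow_index
    )
    return stress / shear_rate


# Hypothetical parameters (assumptions, not values from the paper).
TAU_0 = 5.0  # yield stress, Pa
K = 2.0      # consistency index, Pa*s^n
N = 0.5      # flow index; n < 1 corresponds to shear thinning

if __name__ == "__main__":
    for rate in (1.0, 4.0, 100.0):
        eta = apparent_viscosity(rate, TAU_0, K, N)
        print(f"shear rate {rate:6.1f} 1/s -> apparent viscosity {eta:.3f} Pa*s")
```

With a flow index below 1, the apparent viscosity falls as the shear rate rises, which is the shear-thinning behavior the abstract refers to.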

Steady-State and Dynamic Rheological Properties of a Mineral Oil-Based Ferrofluid

Author: Hongchao Cui
Hujun Wang
Jiahao Dong
Yuan Meng
Zhenkun Li
Publication venue: MDPI AG
Publication date: 01/09/2022

Directory of Open Access Journals

Automatic SAR Change Detection Based on Visual Saliency and Multi-Hierarchical Fuzzy Clustering

Author: Cui Bin
Du Peijun
Peng Yao
Yin Hujun
Zhang Yonghong
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 16/08/2022

The University of Manchester - Institutional Repository

PNeRFLoc: Visual Localization with Point-Based Neural Radiance Fields

Author: Bao Hujun
Cui Zhaopeng
Mao Mao
Yang Luwei
Zhao Boming
Publication venue: Association for the Advancement of Artificial Intelligence
Publication date: 24/03/2024

Due to the ability to synthesize high-quality novel views, Neural Radiance Fields (NeRF) has been recently exploited to improve visual localization in a known environment. However, the existing methods mostly utilize NeRF for data augmentation to improve the regression model training, and their performances on novel viewpoints and appearances are still limited due to the lack of geometric constraints. In this paper, we propose a novel visual localization framework, i.e., PNeRFLoc, based on a unified point-based representation. On one hand, PNeRFLoc supports the initial pose estimation by matching 2D and 3D feature points as traditional structure-based methods; on the other hand, it also enables pose refinement with novel view synthesis using rendering-based optimization. Specifically, we propose a novel feature adaption module to close the gaps between the features for visual localization and neural rendering. To improve the efficacy and efficiency of neural rendering-based optimization, we also developed an efficient rendering-based framework with a warping loss function. Extensive experiments demonstrate that PNeRFLoc performs the best on the synthetic dataset when the 3D NeRF model can be well learned, and significantly outperforms all the NeRF-boosted localization methods with on-par SOTA performance on the real-world benchmark localization datasets. Project webpage: https://zju3dv.github.io/PNeRFLoc/

Association for the Advancement of Artificial Intelligence: AAAI Publications